Quantum Computer Science

An Introduction 

In the 1990s it was realized that quantum physics has some 
spectacular applications in computer science. This book is a concise 
introduction to quantum computation, developing the basic elements 
of this new branch of computational theory without assuming any 
background in physics. It begins with a novel introduction to the 
quantum theory from a computer-science perspective. It illustrates 
the quantum-computational approach with several elementary 
examples of quantum speed-up, before moving to the major 
applications: Shor’s factoring algorithm, Grover’s search algorithm, 
and quantum error correction. 

The book is intended primarily for computer scientists who know 
nothing about quantum theory but would like to learn the elements of 
quantum computation either out of curiosity about this new 
paradigm, or as a basis for further work in the subject. It will also be 
of interest to physicists who want to learn the theory of quantum 
computation, and to physicists and philosophers of science interested 
in quantum foundational issues. It evolved during six years of teaching 
the subject to undergraduates and graduate students in computer 
science, mathematics, engineering, and physics, at Cornell University. 

N. David Mermin is Horace White Professor of Physics Emeritus at 
Cornell University. He has received the Lilienfeld Prize of the 
American Physical Society and the Klopsteg award of the American 
Association of Physics Teachers. He is a member of the U.S. National 
Academy of Sciences and the American Academy of Arts and 
Sciences. Professor Mermin has written on quantum foundational 
issues for several decades, and is known for the clarity and wit of his 
scientific writings. Among his other books are Solid State Physics 
(with N. W. Ashcroft, Thomson Learning 1976), Boojums all the Way 
Through (Cambridge University Press 1990), and It's about Time: 
Understanding Einstein's Relativity (Princeton University Press 2005).



“This is one of the finest books in the rapidly growing field of quantum
information. Almost every page contains a unique insight or a
novel interpretation. David Mermin has once again demonstrated his 
legendary pedagogical skills to produce a classic.” 

Lov Grover, Bell Labs

“Mermin’s book will be a standard for instruction and reference for 
years to come. He has carefully selected, from the mountain of knowledge
accumulated in the last 20 years of research in quantum information
theory, a manageable, coherent subset that constitutes a complete
undergraduate course. While selective, it is in no sense “watered
down”; Mermin moves unflinchingly through difficult arguments in 
the Shor algorithm, and in quantum error correction theory, providing 
invaluable diagrams, clear arguments, and, when necessary, extensive 
appendices to get the students successfully through to the end. The 
book is suffused with Mermin’s unique knowledge of the history of 
modern physics, and has some of the most captivating writing to be 
found in a college textbook.” 

David DiVincenzo, IBM T.J. Watson Research Center 

“Mermin’s book is a gentle introduction to quantum computation especially
aimed at an audience of computer scientists and mathematicians.
It covers the basics of the field, explaining the material clearly and containing
lots of examples. Mermin has always been an entertaining and
comprehensible writer, and continues to be in this book. I expect it to 
become the definitive introduction to this material for non-physicists.” 

Peter Shor, Massachusetts Institute of Technology

“Textbook writers usually strive for a streamlined exposition, smoothing
out the infelicities of thought and notation that plague any field’s
early development. Fortunately, David Mermin is too passionate and
acute an observer of the cultural side of science to fall into this blandness.
Instead of omitting infelicities, he explains and condemns them,
at the same time using his experience of having taught the course many
times to nip nascent misunderstandings in the bud. He celebrates the
field’s mongrel origin in a shotgun wedding between classical computer
scientists, who thought they knew the laws of information, and
quantum physicists, who thought information was not their job. Differences
remain: we hear, for example, why physicists love the Dirac
notation and mathematicians hate it. Worked-out examples and exercises
familiarize students with the necessary algebraic manipulations,
while Mermin’s lucid prose and gentle humor cajole them toward a 
sound intuition for what it all means, not an easy task for a subject 
superficially so counterintuitive.” 

Charles Bennett, IBM T. J. Watson Research Center

Preface


It was almost three quarters of a century after the discovery of quantum
mechanics, and half a century after the birth of information theory
and the arrival of large-scale digital computation, that people finally
realized that quantum physics profoundly alters the character of information
processing and digital computation. For physicists this development
offers an exquisitely different way of using and thinking about
the quantum theory. For computer scientists it presents a surprising 
demonstration that the abstract structure of computation cannot be 
divorced from the physics governing the instrument that performs 
the computation. Quantum mechanics provides new computational 
paradigms that had not been imagined prior to the 1980s and whose 
power was not fully appreciated until the mid 1990s. 

In writing this introduction to quantum computer science I have 
kept in mind readers from several disciplines. Primarily I am addressing
computer scientists, electrical engineers, or mathematicians who
may know little or nothing about quantum physics (or any other kind
of physics) but who wish to acquire enough facility in the subject to be
able to follow the new developments in quantum computation, judge for
themselves how revolutionary they may be, and perhaps choose to participate
in the further development of quantum computer science. Not
the least of the surprising things about quantum computation is that
remarkably little background in quantum mechanics has to be acquired
to understand and work with its applications to information processing.
Familiarity with a few fundamental facts about finite-dimensional
vector spaces over the complex numbers (summarized and reviewed in 
Appendix A) is the only real prerequisite. 

One of the secondary readerships I have in mind consists of physicists
who, like myself - I am a theorist who has worked in statistical
physics, solid-state physics, low-temperature physics, and mathematical
physics - know very little about computer science, but would like
to learn about this extraordinary new application of their discipline.
I stress, however, that my subject is quantum computer science, not
quantum computer design. This is a book about quantum computational
software - not hardware. The difficult question of how one might
actually build a quantum computer is beyond its scope.

Another secondary readership is made up of those philosophers and
physicists who - again like myself - are puzzled by so-called foundational
issues: what the strange quantum formalism implies about the
nature of the world that it so accurately describes. By applying quantum
mechanics in an entirely new way - and especially by applying it
to the processing of knowledge - quantum computation gives a new 
perspective on interpretational questions. While I rarely address such 
matters explicitly, for purely pedagogical reasons my presentation is 
suffused with a perspective on the quantum theory that is very close to 
the venerable but recently much reviled Copenhagen interpretation. 
Those with a taste for such things may be startled to see how well 
quantum computation resonates with the Copenhagen point of view. 
Indeed, it had been my plan to call this book Copenhagen Computation
until the excellent people at Cambridge University Press and my
computer-scientist friends persuaded me that virtually no members of 
my primary readership would then have had any idea what it was about. 

Several years ago I mentioned to a very distinguished theoretical physicist
that I spent the first four lectures of a course in quantum computation
giving an introduction to quantum mechanics for mathematically
literate people who knew nothing about quantum mechanics, and quite 
possibly little if anything about physics. His immediate response was 
that any application of quantum mechanics that can be taught after only 
a four-hour introduction to the subject cannot have serious intellectual 
content. After all, he remarked, it takes any physicist many years to 
develop a feeling for quantum mechanics. 

It’s a good point. Nevertheless computer scientists and mathematicians
with no background in physics have been able quickly to learn
enough quantum mechanics to understand and make major contributions
to the theory of quantum computation. There are two main
reasons for this. 

First of all, a quantum computer - or, more accurately, the abstract 
quantum computer that one hopes someday to be able to embody in actual
hardware - is an extremely simple example of a physical system. It
is discrete, not continuous. It is made up out of a finite number of units,
each of which is the simplest possible kind of quantum-mechanical system,
a so-called two-state system, whose behavior, as we shall see, is
highly constrained and easily specified. Much of the analytical complexity
of learning quantum mechanics is connected with mastering
the description of continuous (infinite-state) systems. By restricting
attention to collections of two-state systems (or even d-state systems
for finite d) one can avoid much suffering. Of course one also loses 
much wisdom, but hardly any of it - at least at this stage of the art - is 
relevant to the basic theory of quantum computation. 

Second, and just as important, the most difficult part of learning 
quantum mechanics is to get a good feeling for how the formalism 
can be applied to actual phenomena. This almost invariably involves
formulating oversimplified abstract models of real physical systems, to 
which the quantum formalism can then be applied. The best physicists 
have an extraordinary intuition for what features of the phenomena 
are essential and must be represented in a model, and what features 
are inessential and can be ignored. It takes years to develop such intuition.
Some never do. The theory of quantum computation, however,
is entirely concerned with an abstract model - the easy part of the 
problem. 

To understand how to build a quantum computer, or even to study 
what physical systems are promising candidates for realizing such a 
device, you must indeed have many years of experience in quantum 
mechanics and its applications under your belt. But if you only want to 
know what such a device is capable in principle of doing once you have it, 
then there is no reason to get involved in the really difficult physics of the 
subject. Exactly the same thing holds for ordinary classical computers. 
One can be a masterful practitioner of computer science without having 
the foggiest notion of what a transistor is, not to mention how it works. 

So while you should be warned that the subset of quantum mechanics 
you will acquire from this book is extremely focused and quite limited 
in its scope, you can also rest assured that it is neither oversimplified nor 
incomplete, when applied to the special task for which it is intended. 

I might note that a third impediment to developing a good intuition 
for quantum physics is that in some ways the behavior implied by 
quantum mechanics is highly counterintuitive, if not downright weird. 
Glimpses of such strange behavior sometimes show up at the level 
of quantum computation. Indeed, for me one of the major appeals of 
quantum computation is that it affords a new conceptual arena for 
trying to come to a better understanding of quantum weirdness. When 
opportunities arise I will call attention to some of this strange behavior, 
rather than (as I easily could) letting it pass by unremarked upon and 
unnoticed. 

The book evolved as notes for a course of 28 one-hour lectures on quantum
computation that I gave six times between 2000 and 2006 to a diverse
group of Cornell University undergraduates, graduate students,
and faculty, in computer science, electrical engineering, mathematics,
physics, and applied physics. With so broad an audience, little common
knowledge could be assumed. My lecture notes, as well as my own
understanding of the subject, repeatedly benefited from comments 
and questions in and after class, coming from a number of different 
perspectives. What made sense to one of my constituencies was often 
puzzling, absurd, or irritatingly simple-minded to others. This final 
form of my notes bears little resemblance to my earliest versions, having
been improved by insightful remarks, suggestions, and complaints
about everything from notation to number theory.

In addition to the 200 or so students who passed through P481-P681-CS483,
I owe thanks to many others. Albert J. Sievers, then Director
of Cornell’s Laboratory of Atomic and Solid State Physics, started 
me thinking hard about quantum computation by asking me to put 
together a two-week set of introductory lectures for members of our 
laboratory, in the Fall of 1999. So many people showed up from all over 
the university that I decided it might be worth expanding this survey
into a full course. I’m grateful to two Physics Department chairs,
Peter Lepage and Saul Teukolsky, for letting me continue teaching
that course for six straight years, and to the Computer Science Department
chair, Charlie van Loan, for support, encouragement, and
a steady stream of wonderful students. John Preskill, though he may 
not know it, taught me much of the subject from his superb online 
Caltech lecture notes. Charles Bennett first told me about quantum 
information processing, back when the term might not even have been 
coined, and he has always been available as a source of wisdom and clarification.
Gilles Brassard has on many occasions supplied me with help
from the computer-science side. Chris Fuchs has been an indispensable
quantum-foundational critic and consultant. Bob Constable made
me, initially against my will, a certified Cornell Information Scientist 
and introduced me to many members of that excellent community. 
But most of all, I owe thanks to David DiVincenzo, who collaborated 
with me on the 1999 two-week LASSP Autumn School and has acted 
repeatedly over the following years as a sanity check on my ideas, an 
indispensable source of references and historical information, a patient 
teacher, and an encouraging friend. 



A note on references 


Quantum Computer Science is a pedagogical introduction to the basic 
structure and procedures of the subject - a quantum-computational 
primer. It is not a historical survey of the development of the field. 
Many of these procedures are named after the people who first put 
them forth, but although I use their names, I do not cite the original 
papers unless they add something to my own exposition. This is because,
not surprisingly, work done since the earliest papers has led to
clearer expositions of those ideas. I learned the subject myself almost 
exclusively from secondary, tertiary, or even higher-order sources, and 
then reformulated it repeatedly in the course of teaching it for six years. 

On the few occasions when I do cite a paper it is either because 
it completes an exposition that I have only sketched, or because the 
work has not yet become identified in the field with the name(s) of the 
author(s) and I wanted to make clear that it was not original with me. 

Readers interested in hunting down earlier work in the field can 
begin (and in most cases conclude) their search at the quantum-physics 
subdivision of the Cornell (formerly Los Alamos) E-print Archive, 
http://arxiv.org/archive/quant-ph, where most of the
important papers in the field have been and are still being posted. 



Chapter 1 

Cbits and Qbits 


1.1 What is a quantum computer? 

It is tempting to say that a quantum computer is one whose operation 
is governed by the laws of quantum mechanics. But since the laws of 
quantum mechanics govern the behavior of all physical phenomena, 
this temptation must be resisted. Your laptop operates under the laws 
of quantum mechanics, but it is not a quantum computer. A quantum 
computer is one whose operation exploits certain very special transformations
of its internal state, whose description is the primary subject of
this book. The laws of quantum mechanics allow these peculiar transformations
to take place under very carefully controlled conditions.

In a quantum computer the physical systems that encode the individual
logical bits must have no physical interactions whatever that are
not under the complete control of the program. All other interactions,
however irrelevant they might be in an ordinary computer - which
we shall call classical - introduce potentially catastrophic disruptions
into the operation of a quantum computer. Such damaging encounters
can include interactions with the external environment, such as
air molecules bouncing off the physical systems that represent bits, or
the absorption of minute amounts of ambient radiant thermal energy.
There can even be disruptive interactions between the computationally
relevant features of the physical systems that represent bits and
other features of those same systems that are associated with computationally
irrelevant aspects of their internal structure. Such destructive
interactions, between what matters for the computation and what does 
not, result in decoherence, which is fatal to a quantum computation. 

To avoid decoherence individual bits cannot in general be encoded 
in physical systems of macroscopic size, because such systems (except 
under very special circumstances) cannot be isolated from their own 
irrelevant internal properties. Such isolation can be achieved if the bits 
are encoded in a small number of states of a system of atomic size, where 
extra internal features do not matter, either because they do not exist, or 
because they require unavailably high energies to come into play. Such 
atomic-scale systems must also be decoupled from their surroundings 
except for the completely controlled interactions that are associated 
with the computational process itself.

Two things keep the situation from being hopeless. First, because
the separation between the discrete energy levels of a system on the 
atomic scale can be enormously larger than the separation between the 
levels of a large system, the dynamical isolation of an atomic system 
is easier to achieve. It can take a substantial kick to knock an atom 
out of its state of lowest energy. The second reason for hope is the 
discovery that errors induced by extraneous interactions can actually 
be corrected if they occur at a sufficiently low rate. While error correction
is routine for bits represented by classical systems, quantum
error correction is constrained by the formidable requirement that it 
be done without knowing either the original or the corrupted state of 
the physical systems that represent the bits. Remarkably, this turns out 
to be possible. 

Although the situation is therefore not hopeless, the practical difficulties
in the way of achieving useful quantum computation are enormous.
Only a rash person would declare that there will be no useful
quantum computers by the year 2050, but only a rash person would
predict that there will be. Never mind. Whether or not it will ever
become a practical technology, there is a beauty to the theory of quantum
computation that gives it a powerful appeal as a lovely branch of
mathematics, and as a strange generalization of the paradigm of classical
computer science, which had completely escaped the attention of
computer scientists until the 1980s. The new paradigm demonstrates 
that the theory of computation can depend profoundly on the physics 
of the devices that carry it out. Quantum computation is also a valuable 
source of examples that illustrate and illuminate, in novel ways, the 
mysterious phenomena that quantum behavior can give rise to. 

For computer scientists the most striking thing about quantum computation
is that a quantum computer can be vastly more efficient than
anything ever imagined in the classical theory of computational complexity,
for certain computational tasks of considerable practical interest.
The time it takes the quantum computer to accomplish such tasks
scales up much more slowly with the size of the input than it does in 
any classical computer. Much of this book is devoted to examining the 
most celebrated examples of this speed-up. 

This exposition of quantum computation begins with an introduction
to quantum mechanics, specially tailored for this particular application.
The quantum-mechanics lessons are designed to give you,
as efficiently as possible, the conceptual tools needed to delve into
quantum computation. This is done by restating the rules of quantum
mechanics, not as the remarkable revision of classical Newtonian mechanics
required to account for the behavior of matter at the atomic
and subatomic levels, but as a curious generalization of rules describing
an ordinary classical digital computer. By focusing exclusively on
how quantum mechanics enlarges the possibilities for the physical manipulation
of digital information, it is possible to characterize how
the quantum theory works in an elementary and quite concise way,
which is nevertheless rigorous and complete for this special area of 
application. 

While I assume no prior familiarity with quantum physics (or any 
other kind of physics), I do assume familiarity with elementary linear 
algebra and, in particular, with the theory of finite-dimensional vector 
spaces over the complex numbers. Appendix A summarizes the relevant 
linear algebra. It is worth examining even if you are well acquainted 
with the mathematics of such vector spaces, since it also provides a 
compact summary of the mathematically unconventional language - 
Dirac notation - in which linear algebra is couched in all treatments of 
quantum computation. Dirac notation is also developed, more informally,
throughout the rest of this chapter.


1.2 Cbits and their states 

We begin with an offbeat formulation of what an ordinary classical 
computer does. I frame the elementary remarks that follow in a language
which may look artificial and cumbersome, but is designed to
accommodate the richer variety of things that a computer can do if it
takes full advantage of the possibilities made available by the quantum-mechanical
behavior of its constituent parts. By introducing and applying
the unfamiliar nomenclature and notation of quantum mechanics
in a familiar classical context, I hope to make a little less strange its 
subsequent extension to the broader quantum setting. 

A classical computer operates on strings of zeros and ones, such 
as 110010111011000, converting them into other such strings. Each 
position in such a string is called a bit, and it contains either a 0 or a
1. To represent such collections of bits the computer must contain a 
corresponding collection of physical systems, each of which can exist 
in two unambiguously distinguishable physical states, associated with 
the value (0 or 1) of the abstract bit that the physical system represents. 
Such a physical system could be, for example, a switch that could be 
open (0) or shut (1), or a magnet whose magnetization could be oriented 
in two different directions, “up” (0) or “down” (1). 

It is a common practice in quantum computer science to use the 
same term “bit” to describe the two-state classical system that represents
the value of the abstract bit. But this use of a single term to
characterize both the abstract bit (0 or 1) and the physical system whose
two states represent the two values is a potential source of confusion.
To avoid such confusion, I shall use the term Cbit (“C” for “classical”)
to describe the two-state classical physical system and Qbit to
describe its quantum generalization. This terminology is inspired by
Paul Dirac’s early use of c-number and q-number to describe classical
quantities and their quantum-mechanical generalizations. “Cbit” and
“Qbit” are preferable to “c-bit” and “q-bit” because the terms themselves
often appear in hyphenated constructions.

Unfortunately the preposterous spelling qubit currently holds sway 
for the quantum system. The term qubit was invented and first used 
in print by the otherwise admirable Benjamin Schumacher.¹ A brief
history of the term can be found in the acknowledgments at the end of
his paper. Although “qubit” honors the English (German, Italian, ...)
rule that q should be followed by u, it ignores the equally powerful
requirement that qu should be followed by a vowel. My guess is that 
“qubit” has gained acceptance because it visually resembles an obsolete 
English unit of distance, the homonymic cubit. To see its ungainliness 
with fresh eyes, it suffices to imagine that Dirac had written qunumber 
instead of q-number, or that one erased transparencies and cleaned one’s 
ears with Qutips. 

Because clear distinctions among bits, Cbits, and Qbits are crucial 
in the introduction to quantum computation that follows, I shall use 
this currently unfashionable terminology. If you are already addicted 
to the term qubit, please regard Qbit as a convenient abbreviation.

To prepare for the extension from Cbits to Qbits, I introduce what 
may well strike you as a degree of notational overkill in the discussion 
of Cbits that follows. We shall represent the state of each Cbit as a kind 
of box, depicted by the symbol | ⟩, into which we place the value, 0
or 1, represented by that state. Thus the two distinguishable states of
a Cbit are represented by the symbols |0⟩ and |1⟩. It is the common
practice to call the symbol |0⟩ or |1⟩ itself the state of the Cbit, thereby
using the same term to refer to both the physical condition of the 
Cbit and the abstract symbol that represents that physical condition. 
There is nothing unusual in this. For example one commonly uses the 
term “position” to refer to the symbol x that represents the physical 
position of an object. I call this common, if little noted, practice to your 
attention only because in the quantum case “state” refers only to the 
symbol, there being no internal property of the Qbit that the symbol 
represents. The subtle relation between Qbits and their state symbol 
will emerge later in this chapter. 

Along the same lines, we shall characterize the states of the five Cbits 
representing 11001, for example, by the symbol 

|1⟩|1⟩|0⟩|0⟩|1⟩, (1.1)

and refer to this object as the state of all five Cbits. Thus a pair of Cbits 
can have (or “be in”) any of the four possible states 

|0⟩|0⟩, |0⟩|1⟩, |1⟩|0⟩, |1⟩|1⟩, (1.2)

three Cbits can be in any of the eight possible states

|0⟩|0⟩|0⟩, |0⟩|0⟩|1⟩, |0⟩|1⟩|0⟩, |0⟩|1⟩|1⟩, |1⟩|0⟩|0⟩, |1⟩|0⟩|1⟩, |1⟩|1⟩|0⟩, |1⟩|1⟩|1⟩, (1.3)

and so on.

¹ Benjamin Schumacher, “Quantum coding,” Physical Review A 51, 2738-2747 (1995).

As (1.4) already makes evident, when there are many Cbits such
products are often much easier to read if one encloses the whole string
of zeros and ones in a single bigger box of the form | ⟩ rather than
having a separate box for each Cbit:

|000⟩, |001⟩, |010⟩, |011⟩, |100⟩, |101⟩, |110⟩, |111⟩. (1.4)

We shall freely move between these two equivalent ways of expressing
the state of several Cbits that represent a string of bits, boxing the whole
string or boxing each individual bit. Whether the form (1.3) or (1.4) is
to be preferred depends on the context.

There is also a third form, which is useful when we regard the zeros 
and ones as constituting the binary expansion of an integer. We can 
then replace the representations of the 3-Cbit states in (1.4) by the 
even shorter forms 

|0⟩, |1⟩, |2⟩, |3⟩, |4⟩, |5⟩, |6⟩, |7⟩. (1.5)

Note that, unlike the forms (1.3) and (1.4), the form (1.5) is ambiguous,
unless we are told that these symbols express states of three Cbits. If
we are not told, then there is no way of telling, for example, whether
|3⟩ represents the 2-Cbit state |11⟩, the 3-Cbit state |011⟩, or the 4-Cbit
state |0011⟩, etc. This ambiguity can be removed, when necessary, by
adding a subscript making the number of Cbits explicit:

|0⟩_3, |1⟩_3, |2⟩_3, |3⟩_3, |4⟩_3, |5⟩_3, |6⟩_3, |7⟩_3. (1.6)
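
The relation between the integer labels of (1.5) and the bit strings of (1.4) can be sketched in a few lines of Python. This is only an illustration: the function names `ket` and `ket_product` are hypothetical, not from the text, and ASCII `>` stands in for the ket bracket.

```python
def ket(value: int, n_cbits: int) -> str:
    """Label of the n-Cbit state whose bits are the binary expansion of value."""
    if n_cbits < 1 or not 0 <= value < 2 ** n_cbits:
        raise ValueError("value does not fit in the given number of Cbits")
    # Zero-pad the binary expansion to exactly n_cbits digits.
    return "|" + format(value, f"0{n_cbits}b") + ">"


def ket_product(value: int, n_cbits: int) -> str:
    """Same state, written with a separate box for each Cbit."""
    bits = ket(value, n_cbits)[1:-1]
    return "".join(f"|{bit}>" for bit in bits)


# The integer 3 names a different state for each number of Cbits.
for n in (2, 3, 4):
    print(n, ket(3, n), ket_product(3, n))
```

The loop makes the ambiguity concrete: the same integer label expands to a different string of zeros and ones once the number of Cbits is fixed.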

Be warned, however, that, when there is no need to emphasize how
many Cbits |x⟩ represents, it can be useful to use such subscripts for
other purposes. If, for example, Alice and Bob each possess a single
Cbit it can be convenient to describe the state of Alice’s Cbit (if it has
the value 1) by |1⟩_a, Bob’s (if it has the value 0) by |0⟩_b, and the joint
state of the two by |1⟩_a|0⟩_b or |10⟩_ab.

Dirac introduced the | ⟩ notation (known as Dirac notation) in the
early days of the quantum theory, as a useful way to write and manipulate
vectors. For silly reasons he called such vectors kets, a terminology
that has survived to this day. In Dirac notation you can put into the box
| ⟩ anything that serves to specify what the vector is. If, for example, we
were talking about displacement vectors in ordinary three-dimensional
space, we could have a vector

|5 horizontal centimeters northeast⟩. (1.7)

In using Dirac notation to express the state of a Cbit, or a collection
of Cbits, I’m suggesting that there might be some utility in thinking
of the states as vectors. Is there? Well, in the case of Cbits, not very
much, but maybe a little. We now explore this way of thinking about
Cbit states, because when we come to the generalization to Qbits, it
becomes absolutely essential to consider them to be vectors - so much
so that the term state is often taken to be synonymous with vector (or,
more precisely, “vector that represents the state”).

We shall briefly explore what one can do with Cbits when one takes
the two states |0⟩ and |1⟩ of a single Cbit to be represented by two
orthogonal unit vectors in a two-dimensional space. While this is little
more than a curious and unnecessarily elaborate way of describing
Cbits, it is fundamental and unavoidable in dealing with Qbits. Playing
unfamiliar and somewhat silly games with Cbits will enable you to
become acquainted with much of the quantum-mechanical formalism
in a familiar setting.

If you prefer your vectors to be expressed in terms of components,
note that we can represent the two orthogonal states of a single Cbit,
|0⟩ and |1⟩, as column vectors

$$
|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{1.8}
$$

In the case of two Cbits the vector space is four-dimensional, with
an orthonormal basis

|00⟩, |01⟩, |10⟩, |11⟩. (1.9)

The alternative notation for this basis,

|0⟩|0⟩, |0⟩|1⟩, |1⟩|0⟩, |1⟩|1⟩, (1.10)

is deliberately designed to suggest multiplication, since it is, in fact,
a short-hand notation for the tensor product of the two single-Cbit
2-vectors, written in more formal mathematical notation as

|0⟩ ⊗ |0⟩, |0⟩ ⊗ |1⟩, |1⟩ ⊗ |0⟩, |1⟩ ⊗ |1⟩. (1.11)

In terms of components, the tensor product $a \otimes b$ of an $M$-component
vector $a$ with components $a_\mu$ and an $N$-component vector $b$ with
components $b_\nu$ is the $(MN)$-component vector with components indexed
by all the $MN$ possible pairs of indices $(\mu, \nu)$, whose $(\mu, \nu)$th
component is just the product $a_\mu b_\nu$. A broader view can be found in the
extended review of vector-space concepts in Appendix A. I shall freely
move back and forth between the various ways (1.9)-(1.11) of writing
the tensor product and their generalizations to multi-Cbit states, using
in each case a form that makes the content clearest.

Once one agrees to regard the two 1-Cbit states as orthogonal unit
vectors, the tensor product is indeed the natural way to represent
multi-Cbit states, since it leads to the obvious multi-Cbit generalization
of the representation (1.8) of 1-Cbit states as column vectors. If we
express the states $|0\rangle$ and $|1\rangle$ of each single Cbit as column vectors, then
we can get the column vector describing a multi-Cbit state by repeatedly
applying the rule for the components of the tensor product of two
vectors. The result is illustrated here for a three-fold tensor product:

$$\begin{pmatrix} x_0 \\ x_1 \end{pmatrix} \otimes \begin{pmatrix} y_0 \\ y_1 \end{pmatrix} \otimes \begin{pmatrix} z_0 \\ z_1 \end{pmatrix} = \begin{pmatrix} x_0y_0z_0 \\ x_0y_0z_1 \\ x_0y_1z_0 \\ x_0y_1z_1 \\ x_1y_0z_0 \\ x_1y_0z_1 \\ x_1y_1z_0 \\ x_1y_1z_1 \end{pmatrix}. \tag{1.12}$$

On applying this, for example, to the case $|5\rangle_3$, we have

$$|5\rangle_3 = |101\rangle = |1\rangle|0\rangle|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}. \tag{1.13}$$


If we label the vertical components of the 8-vector on the right
0, 1, ..., 7, from the top down, then the single nonzero component is
the 1 in position 5 - precisely the position specified by the state vector
in its form on the left of (1.13). This is indeed the obvious multi-Cbit
generalization of the column-vector form (1.8) for 1-Cbit states.

This is quite general: the tensor-product structure of multi-Cbit
states is just what one needs in order for the $2^n$-dimensional column
vector representing the state $|m\rangle_n$ to have all its entries zero except for
a single 1 in the $m$th position down from the top.
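
As a small illustration of this rule (a sketch only, not something taken from the text), the following Python fragment builds the column vector for $|m\rangle_n$ by repeatedly applying the component rule for the tensor product to the 1-Cbit vectors of (1.8). The helper names `kron_vec` and `basis_state` are assumptions chosen for this example.

```python
def kron_vec(a, b):
    """Tensor product of two component lists: entry (mu, nu) is a[mu] * b[nu]."""
    return [x * y for x in a for y in b]


def basis_state(m, n):
    """Column vector for |m>_n, built as a tensor product of n 1-Cbit vectors."""
    ket = {0: [1, 0], 1: [0, 1]}          # the column vectors of (1.8)
    vec = [1]                             # empty product
    for j in reversed(range(n)):          # leftmost factor is the most significant bit
        bit = (m >> j) & 1
        vec = kron_vec(vec, ket[bit])
    return vec


five = basis_state(5, 3)
print(five)            # [0, 0, 0, 0, 0, 1, 0, 0]
print(five.index(1))   # 5
```

Counting positions from 0 at the top, the single 1 lands in position `m` for every `m` below `2**n`, which is the content of the general statement above.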

One can turn this development upside down, taking as one's starting
point the simple rule that an integer $x$ in the range $0 \le x < N$ is
represented by one of $N$ orthonormal vectors in an $N$-dimensional
space. One can then pick a basis so that 0 is represented by an
$N$-component column vector $|0\rangle$ that has 0 in every position except for a
1 in the top position, and $x$ is to be represented by an $N$-component
column vector $|x\rangle$ that has 0 in every position except for a 1 in the
position $x$ down from the top. It then follows from the nature of the
tensor product that if $N = 2^n$ and $x$ has the binary expansion
$x = \sum_{j=0}^{n-1} x_j 2^j$, then the column vector $|x\rangle_n$ is the tensor product
of the $n$ 2-component column vectors $|x_j\rangle$:

$$|x\rangle_n = |x_{n-1}\rangle \otimes |x_{n-2}\rangle \otimes \cdots \otimes |x_1\rangle \otimes |x_0\rangle. \tag{1.14}$$

In dealing with $n$-Cbit states of the form (1.14) we shall identify each
of the $n$ 1-Cbit states, out of which they are composed, by giving the
power of 2 associated with the individual bit that the Cbit represents.
Thus the 1-Cbit state on the extreme right of (1.14) represents Cbit 0,
the state immediately to its left represents Cbit 1, and so on.

This relation between tensor products of vectors and positional
notation for integers is not confined to the binary system. Suppose,
for example, one represents a decimal digit $x = 0, 1, \ldots, 9$ as a
10-component column vector $v^{(x)}$ with all components 0 except for a
1, $x$ positions down from the top. If the $n$-digit decimal number
$X = \sum_{j=0}^{n-1} x_j 10^j$ is represented by the tensor product
$V = v^{(x_{n-1})} \otimes v^{(x_{n-2})} \otimes \cdots \otimes v^{(x_1)} \otimes v^{(x_0)}$, then $V$ will be a
$10^n$-component column vector with all components 0 except for a 1,
$X$ positions down from the top.

Although the representation of Cbit states by column vectors clearly 
shows why tensor products give a natural description of multi-Cbit 
states, for almost all other purposes it is better and much simpler to 
forget about column vectors and components, and deal directly with 
the state vectors in their abstract forms (1.3)—(1.6). 


1.3 Reversible operations on Cbits 

Quantum computers do an important part of their magic through
reversible operations, which transform the initial state of the Qbits into
its final form using only processes whose action can be inverted. There
is only a single irreversible component to the operation of a quantum
computer, called measurement, which is the only way to extract useful
information from the Qbits after their state has acquired its final form. 
Although measurement is a nontrivial and crucial part of any quantum 
computation, in a classical computer the extraction of information from 
the state of the Cbits is so conceptually straightforward that it is not 
viewed as an inherent part of the computational process, though it is, 
of course, a nontrivial concern for those who design digital displays 
or printers. Because the only computationally relevant operations on 
a classical computer that can be extended to operations on a quantum 
computer are reversible, only operations on Cbits that are reversible 
will be of interest to us here. 

In a reversible operation every final state arises from a unique initial 
state. An example of an irreversible operation is ERASE, which forces 
a Cbit into the state |0) regardless of whether its initial state is |0) or 
|1). ERASE is irreversible in the sense that, given only the final state 
and the fact that it was the output of the operation ERASE, there is no 
way to recover the initial state. 
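
The definition can be made concrete with a short Python sketch (an illustration under my own naming, not part of the original text): an operation on a finite set of states is written as a lookup table, and it is reversible exactly when no two initial states share a final state. The names `is_reversible`, `ERASE`, `NOT`, and `IDENTITY` are assumptions made for this example.

```python
def is_reversible(table):
    """True if every final state arises from a unique initial state."""
    finals = list(table.values())
    return len(set(finals)) == len(finals)


ERASE = {0: 0, 1: 0}       # forces the Cbit into state 0
NOT = {0: 1, 1: 0}         # interchanges the two states
IDENTITY = {0: 0, 1: 1}    # leaves the Cbit alone

for name, op in [("ERASE", ERASE), ("NOT", NOT), ("IDENTITY", IDENTITY)]:
    print(name, is_reversible(op))
```

Only ERASE fails the test, because both initial states lead to the same final state.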

The only nontrivial reversible operation we can apply to a single Cbit
is the NOT operation, denoted by the symbol X, which interchanges
the two states $|0\rangle$ and $|1\rangle$:

$$\mathbf{X}: |x\rangle \to |\tilde{x}\rangle; \qquad \tilde{1} = 0, \quad \tilde{0} = 1. \tag{1.15}$$

This is sometimes referred to as flipping the Cbit. NOT is reversible
because it has an inverse: applying X a second time brings the state of
the Cbit back to its original form:

$$\mathbf{X}^2 = \mathbf{1}, \tag{1.16}$$

where 1 is the unit (identity) operator. If we represent the two
orthogonal states of the Cbit by the column vectors (1.8), then we can
express NOT by a linear operator X on the two-dimensional vector
space, whose action on the column vectors is given by the matrix

$$\mathbf{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{1.17}$$

So the two reversible things you can do to a single Cbit - leaving it
alone and flipping it - correspond to the two linear operators X and 1,

$$\mathbf{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{1.18}$$

on its two-dimensional vector space.

A pedantic digression: since multiplication by the scalar 1 and
action by the unit operator 1 achieve the same result, I shall sometimes
follow the possibly irritating practice of physicists and not distinguish
notationally between them. I shall take similar liberties with the scalar
0, the zero vector 0, and the zero operator 0.

Possibilities for reversible operations get richer when we go from a
single Cbit to a pair of Cbits. The most general reversible operation on
two Cbits is any permutation of their four possible states. There are
4! = 24 such operations. Perhaps the simplest nontrivial example is the
swap (or exchange) operator $\mathbf{S}_{ij}$, which simply interchanges the states
of Cbits $i$ and $j$:

$$\mathbf{S}_{10}|xy\rangle = |yx\rangle. \tag{1.19}$$

Since the swap operator $\mathbf{S}_{10}$ interchanges $|01\rangle = |1\rangle_2$ and $|10\rangle = |2\rangle_2$,
while leaving $|00\rangle = |0\rangle_2$ and $|11\rangle = |3\rangle_2$ fixed, its matrix in the basis
$|0\rangle_2, |1\rangle_2, |2\rangle_2, |3\rangle_2$ is

$$\mathbf{S}_{10} = \mathbf{S}_{01} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{1.20}$$


The 2-Cbit operator whose extension to Qbits plays by far the
most important role in quantum computation is the controlled-NOT
or cNOT operator $\mathbf{C}_{ij}$. If the state of the $i$th Cbit (the control Cbit) is
$|0\rangle$, $\mathbf{C}_{ij}$ leaves the state of the $j$th Cbit (the target Cbit) unchanged, but,
if the state of the control Cbit is $|1\rangle$, $\mathbf{C}_{ij}$ applies the NOT operator X
to the state of the target Cbit. In either case the state of the control Cbit
is left unchanged.

We can summarize this compactly by writing

$$\mathbf{C}_{10}|x\rangle|y\rangle = |x\rangle|y \oplus x\rangle, \qquad \mathbf{C}_{01}|x\rangle|y\rangle = |x \oplus y\rangle|y\rangle, \tag{1.21}$$

where $\oplus$ denotes addition modulo 2:

$$y \oplus 0 = y, \qquad y \oplus 1 = \tilde{y} = 1 - y. \tag{1.22}$$

The modulo-2 sum $x \oplus y$ is also called the “exclusive OR” (or XOR)
of $x$ and $y$.

You can construct SWAP out of three cNOT operations:

$$\mathbf{S}_{ij} = \mathbf{C}_{ij}\mathbf{C}_{ji}\mathbf{C}_{ij}. \tag{1.23}$$

This can easily be verified by repeated applications of (1.21), noting
that $x \oplus x = 0$. We note some other ways of showing it below.
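
The verification can be sketched in a few lines of Python, treating each Cbit as an integer 0 or 1 and the cNOT as an XOR into the target, as in (1.21). This is only an illustration; the function names `cnot` and `swap_via_cnots` and the dictionary representation of the Cbits are assumptions made for the example.

```python
def cnot(control, target, bits):
    """Apply cNOT to a dict {cbit_label: value}: target <- target XOR control."""
    out = dict(bits)
    out[target] = bits[target] ^ bits[control]
    return out


def swap_via_cnots(i, j, bits):
    """Apply C_ij, then C_ji, then C_ij (the rightmost operator acts first)."""
    for control, target in [(i, j), (j, i), (i, j)]:
        bits = cnot(control, target, bits)
    return bits


# Check all four 2-Cbit states |x y>, with Cbit 1 on the left and Cbit 0 on the right.
for x in (0, 1):
    for y in (0, 1):
        result = swap_via_cnots(1, 0, {1: x, 0: y})
        assert result == {1: y, 0: x}
print("three cNOTs swap every 2-Cbit state")
```

Tracing one case by hand shows where $x \oplus x = 0$ enters: the target of the last cNOT ends up holding $y \oplus x \oplus y = x$.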

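To construct the matrix for the cNOT operation in the
four-dimensional 2-Cbit space, note that if the control Cbit is on the left
then cNOT leaves $|00\rangle = |0\rangle_2$ and $|01\rangle = |1\rangle_2$ fixed and exchanges
$|10\rangle = |2\rangle_2$ and $|11\rangle = |3\rangle_2$. Therefore the $4 \times 4$ matrix representing
$\mathbf{C}_{10}$ is just

$$\mathbf{C}_{10} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \tag{1.24}$$

If the control Cbit is on the right, then the states $|01\rangle = |1\rangle_2$ and
$|11\rangle = |3\rangle_2$ are interchanged, and $|00\rangle = |0\rangle_2$ and $|10\rangle = |2\rangle_2$ are fixed,
so the matrix representing $\mathbf{C}_{01}$ is

$$\mathbf{C}_{01} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}. \tag{1.25}$$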


The construction (1.23) of S out of cNOT operators also follows
from (1.20), (1.24), and (1.25), using matrix multiplication. As a
practical matter, it is almost always more efficient to establish operator
identities by dealing with them directly as operators, avoiding matrix
representations.
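
For readers who do want to see the matrix route once, here is a minimal Python sketch of that multiplication. The matrices are the swap and cNOT matrices described above for the basis $|0\rangle_2, \ldots, |3\rangle_2$; the helper `matmul` and the variable names are assumptions made for this example.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


S = [[1, 0, 0, 0],      # swap: exchanges |01> and |10>
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
C10 = [[1, 0, 0, 0],    # cNOT, control on the left: exchanges |10> and |11>
       [0, 1, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]
C01 = [[1, 0, 0, 0],    # cNOT, control on the right: exchanges |01> and |11>
       [0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 1, 0, 0]]

product = matmul(C10, matmul(C01, C10))
assert product == S
print("C10 C01 C10 equals the swap matrix")
```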

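A very common kind of 2-Cbit operator consists of the tensor
product $\otimes$ of two 1-Cbit operators:

$$(\mathbf{a} \otimes \mathbf{b})|xy\rangle = (\mathbf{a} \otimes \mathbf{b})|x\rangle \otimes |y\rangle = \mathbf{a}|x\rangle \otimes \mathbf{b}|y\rangle, \tag{1.26}$$

from which it follows that

$$(\mathbf{a} \otimes \mathbf{b})(\mathbf{c} \otimes \mathbf{d}) = (\mathbf{a}\mathbf{c}) \otimes (\mathbf{b}\mathbf{d}). \tag{1.27}$$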
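
The product rule for tensor products of operators can be spot-checked numerically. The sketch below represents 1-Cbit operators as $2 \times 2$ matrices and the tensor product as the Kronecker product of matrices; the helpers `matmul` and `kron` and the particular test matrices are assumptions chosen for illustration, not anything prescribed by the text.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


def kron(A, B):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


a = [[0, 1], [1, 0]]
b = [[1, 0], [0, -1]]
c = [[0, 0], [0, 1]]
d = [[1, 2], [3, 4]]     # an arbitrary matrix, to make the check less special

lhs = matmul(kron(a, b), kron(c, d))
rhs = kron(matmul(a, c), matmul(b, d))
assert lhs == rhs
print("(a x b)(c x d) == (ac) x (bd) for this choice of matrices")
```

A check on one set of matrices is of course not a proof, but it is a quick way to catch an index convention that has been set up the wrong way round.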

This tensor-product notation for operators can become quite
ungainly when one is dealing with a large number of Cbits and wants to
write a 2-Cbit operator that affects only a particular pair of Cbits. If,
for example, the 2-Cbit operator in (1.26) acts only on the second and
fourth Cbits from the right in a 6-Cbit state, then the operator on the
6-Cbit state has to be written as

$$\mathbf{1} \otimes \mathbf{1} \otimes \mathbf{a} \otimes \mathbf{1} \otimes \mathbf{b} \otimes \mathbf{1}. \tag{1.28}$$

To avoid such typographical monstrosities, we simplify (1.28) to

$$\mathbf{1} \otimes \mathbf{1} \otimes \mathbf{a} \otimes \mathbf{1} \otimes \mathbf{b} \otimes \mathbf{1} = \mathbf{a}_3\mathbf{b}_1 = \mathbf{b}_1\mathbf{a}_3, \tag{1.29}$$

where the subscript indicates which Cbit the 1-Cbit operator acts on,
and it is understood that those Cbit states whose subscripts do not
appear remain unmodified - i.e. they are acted on by the unit operator.
As noted above, we label each 1-Cbit state by the power of 2 it would
represent if the $n$ Cbits were representing an integer: the state on the
extreme right is labeled 0, the one to its left, 1, etc. Since the order
in which a and b are written is clearly immaterial if their subscripts
specify different 1-Cbit states, the order in which one writes them in
(1.29) doesn't matter: 1-Cbit operators that act on different 1-Cbit
states commute.

Sometimes we deal with 1-Cbit operators that already have
subscripts in their names; under such conditions it is more convenient
to indicate which Cbit state the operator acts on by a superscript,
enclosed in parentheses to avoid confusion with an exponent:
thus $\mathbf{X}^{(2)}$ represents the 1-Cbit operator that flips the third Cbit
state from the right, but $\mathbf{X}^2$ represents the square of the flip
operator (i.e. the unit operator) without reference to which Cbit state it
acts on.

To prepare for some of the manipulations we will be doing with
operations on Qbits, we now examine a few examples of working with
operators on Cbits.


1.4 Manipulating operations on Cbits 

It is useful to introduce a 1-Cbit operator n that is simply the projection
operator onto the state $|1\rangle$:

$$\mathbf{n}|x\rangle = x|x\rangle, \qquad x = 0 \text{ or } 1. \tag{1.30}$$

Because $|0\rangle$ and $|1\rangle$ are eigenvectors of n with eigenvalues 0 and 1, n is
called the 1-Cbit number operator. We also define the complementary
operator,

$$\tilde{\mathbf{n}} = \mathbf{1} - \mathbf{n}, \tag{1.31}$$

which projects onto the state $|0\rangle$, so $|0\rangle$ and $|1\rangle$ are eigenvectors of $\tilde{\mathbf{n}}$ with
eigenvalues 1 and 0. These operators have the matrix representations

$$\mathbf{n} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \tilde{\mathbf{n}} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \tag{1.32}$$


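It follows directly from their definitions that

$$\mathbf{n}^2 = \mathbf{n}, \qquad \tilde{\mathbf{n}}^2 = \tilde{\mathbf{n}}, \qquad \mathbf{n}\tilde{\mathbf{n}} = \tilde{\mathbf{n}}\mathbf{n} = 0, \qquad \mathbf{n} + \tilde{\mathbf{n}} = \mathbf{1}. \tag{1.33}$$

We also have

$$\mathbf{n}\mathbf{X} = \mathbf{X}\tilde{\mathbf{n}}, \qquad \tilde{\mathbf{n}}\mathbf{X} = \mathbf{X}\mathbf{n}, \tag{1.34}$$

since flipping the state of a Cbit and then acting on it with $\mathbf{n}$ ($\tilde{\mathbf{n}}$) is the
same as acting on the state with $\tilde{\mathbf{n}}$ ($\mathbf{n}$) and then flipping it. All the simple
relations in (1.33) and (1.34) also follow, as they must, from the matrix
representations (1.17) and (1.32) for $\mathbf{X}$, $\mathbf{n}$, and $\tilde{\mathbf{n}}$.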

Although n has no interpretation as a physical operation on Cbits -
replacing the state of a Cbit by the zero vector corresponds to no
physical operation - it can be useful in deriving relations between operations
that do have physical meaning. Since, for example, the SWAP operator
$\mathbf{S}_{ij}$ acts as the identity if the states of the Cbits $i$ and $j$ are the same, and
flips the numbers represented by both Cbits if their states are different,
it can be written as

$$\mathbf{S}_{ij} = \mathbf{n}_i\mathbf{n}_j + \tilde{\mathbf{n}}_i\tilde{\mathbf{n}}_j + (\mathbf{X}_i\mathbf{X}_j)(\mathbf{n}_i\tilde{\mathbf{n}}_j + \tilde{\mathbf{n}}_i\mathbf{n}_j). \tag{1.35}$$

At the risk of belaboring the obvious, I note that (1.35) acts as the
swap operator because if both Cbits are in the state $|1\rangle$ (so swapping
their states does nothing) then only the first term in the sum acts (i.e.
each of the other three terms gives 0) and multiplies the state by 1;
if both Cbits are in the state $|0\rangle$, only the second term acts and again
multiplies the state by 1; if Cbit $i$ is in the state $|1\rangle$ and Cbit $j$ is in the
state $|0\rangle$, only the third term acts and the effect of flipping both Cbits
is to swap their states; and if Cbit $i$ is in the state $|0\rangle$ and Cbit $j$ is in
the state $|1\rangle$, only the fourth term acts and the effect of the two Xs is
again to swap their states.

To help you become more at home with this notation, you are urged
to prove from (1.35) that $\mathbf{S}_{ij}^2 = \mathbf{1}$, using only the relations in (1.33) and
(1.34), the fact that $\mathbf{X}^2 = \mathbf{1}$, and the fact that 1-Cbit operators acting
on different Cbits commute.

The construction (1.23) of SWAP out of cNOT operators can also
be demonstrated using a more algebraic approach. Note first that $\mathbf{C}_{ij}$
can be expressed in terms of ns and Xs by

$$\mathbf{C}_{ij} = \tilde{\mathbf{n}}_i + \mathbf{X}_j\mathbf{n}_i, \tag{1.36}$$

since if the state of Cbit $i$ is $|0\rangle$ only the first term acts, which leaves the
states of both Cbits unchanged, but if the state of Cbit $i$ is $|1\rangle$ only the
second term acts, which leaves the state of Cbit $i$ unchanged, while $\mathbf{X}_j$
flips Cbit $j$. If you substitute expressions of the form (1.36) for each
of the three terms in (1.23), then you can show by purely algebraic
manipulations that four of the eight terms into which the products
expand vanish and the remaining four can be rearranged to give the
swap operator (1.35).
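
The algebra itself is the point of the exercise, but a numerical cross-check is easy to set up. The Python sketch below builds the 2-Cbit swap operator as a sum of products of number operators and flips (both Cbits 1, both 0, and the two mixed cases with a double flip), builds the cNOT as "complementary projector on the control, plus flip of the target times number operator on the control", and confirms that the square of the swap is the identity and that three cNOTs give the swap. The helper names (`matmul`, `kron`, `add`, `on`, `swap`, `cnot`) are assumptions made for this example.

```python
def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


def kron(A, B):
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def add(*Ms):
    """Entrywise sum of matrices of equal shape."""
    return [[sum(M[r][c] for M in Ms) for c in range(len(Ms[0][0]))]
            for r in range(len(Ms[0]))]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
n = [[0, 0], [0, 1]]      # projects onto |1>
nt = [[1, 0], [0, 0]]     # complementary operator 1 - n, projects onto |0>


def on(op, cbit):
    """Embed a 1-Cbit operator in the 2-Cbit space (Cbit 1 left, Cbit 0 right)."""
    return kron(op, I2) if cbit == 1 else kron(I2, op)


def swap():
    n1, n0, nt1, nt0 = on(n, 1), on(n, 0), on(nt, 1), on(nt, 0)
    flip_both = matmul(on(X, 1), on(X, 0))
    mixed = add(matmul(n1, nt0), matmul(nt1, n0))
    return add(matmul(n1, n0), matmul(nt1, nt0), matmul(flip_both, mixed))


def cnot(i, j):
    """Control Cbit i, target Cbit j."""
    return add(on(nt, i), matmul(on(X, j), on(n, i)))


S = swap()
I4 = kron(I2, I2)
assert matmul(S, S) == I4
assert matmul(cnot(1, 0), matmul(cnot(0, 1), cnot(1, 0))) == S
print(S)   # [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
```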

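An operator that has no direct role to play in classical
computation, but which is as important as the NOT operator X in quantum
computation, is the operator Z defined by

$$\mathbf{Z} = \tilde{\mathbf{n}} - \mathbf{n} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1.37}$$

It follows from (1.34) (or from the matrix representations (1.17) and
(1.37)) that X anticommutes with Z:

$$\mathbf{Z}\mathbf{X} = -\mathbf{X}\mathbf{Z}. \tag{1.38}$$

Since $\mathbf{n} + \tilde{\mathbf{n}} = \mathbf{1}$, we can use (1.37) to express the 1-Cbit projection
operators $\mathbf{n}$ and $\tilde{\mathbf{n}}$ in terms of 1 and Z:

$$\mathbf{n} = \tfrac{1}{2}(\mathbf{1} - \mathbf{Z}), \qquad \tilde{\mathbf{n}} = \tfrac{1}{2}(\mathbf{1} + \mathbf{Z}). \tag{1.39}$$

Using this we can rewrite the cNOT operator (1.36) in terms of X
and Z operators:

$$\mathbf{C}_{ij} = \tfrac{1}{2}(\mathbf{1} + \mathbf{Z}_i) + \tfrac{1}{2}\mathbf{X}_j(\mathbf{1} - \mathbf{Z}_i) = \tfrac{1}{2}(\mathbf{1} + \mathbf{X}_j) + \tfrac{1}{2}\mathbf{Z}_i(\mathbf{1} - \mathbf{X}_j). \tag{1.40}$$

The second form follows from the first because $\mathbf{X}_j$ and $\mathbf{Z}_i$ commute
when $i \ne j$. Note that, if we were to interchange X and Z in the second
line of (1.40), we would get back the expression directly above it except
for the interchange of $i$ and $j$. So interchanging the X and Z operators
has the effect of switching which Cbit is the control and which is
the target, changing $\mathbf{C}_{ij}$ into $\mathbf{C}_{ji}$. An operator that can produce just
this effect is the Hadamard transformation (also sometimes called the
Walsh-Hadamard transformation),

$$\mathbf{H} = \frac{1}{\sqrt{2}}(\mathbf{X} + \mathbf{Z}) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{1.41}$$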

This is another operator of fundamental importance in quantum 
computation. 2 


2 Physicists should note here an unfortunate clash between the notations of 
quantum computer science and physics. Quantum physicists invariably use 
H to denote the Hamiltonian function (in classical mechanics) or 
Hamiltonian operator (in quantum mechanics). Fortunately Hamiltonian 
operators, although of crucial importance in the design of quantum 
computers, play a very limited role in the general theory of quantum 
computation, being completely overshadowed by the unitary 
transformations that they generate. So physicists can go along with the 
computer-science notation without getting into serious trouble. 

Since $\mathbf{X}^2 = \mathbf{Z}^2 = \mathbf{1}$ and $\mathbf{X}\mathbf{Z} = -\mathbf{Z}\mathbf{X}$, one easily shows from the
definition (1.41) of H in terms of X and Z that

$$\mathbf{H}^2 = \mathbf{1} \tag{1.42}$$

and that

$$\mathbf{H}\mathbf{X}\mathbf{H} = \mathbf{Z}, \qquad \mathbf{H}\mathbf{Z}\mathbf{H} = \mathbf{X}. \tag{1.43}$$

This shows how H can be used to interchange the X and Z operators
in $\mathbf{C}_{ij}$: it follows from (1.43), together with (1.40) and (1.42), that

$$\mathbf{C}_{ji} = (\mathbf{H}_i\mathbf{H}_j)\mathbf{C}_{ij}(\mathbf{H}_i\mathbf{H}_j). \tag{1.44}$$

We shall see that this simple relation can be put to some quite
remarkable uses in a quantum computer. While one can achieve
this interchange on a classical computer using the SWAP operation,
$\mathbf{C}_{ji} = \mathbf{S}_{ij}\mathbf{C}_{ij}\mathbf{S}_{ij}$, the crucial difference between $\mathbf{S}_{ij}$ and $\mathbf{H}_i\mathbf{H}_j$ is that
the latter is a product of two 1-Cbit operators, while the former is not.
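
A numerical sanity check of the Hadamard relations is straightforward. The sketch below uses floating-point matrices, so equality is tested only up to a small tolerance; the helper names (`matmul`, `kron`, `close`) and the tolerance value are assumptions made for this example.

```python
import math


def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


def kron(A, B):
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def close(A, B, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(a - b) < tol
               for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


h = 1 / math.sqrt(2)
H = [[h, h], [h, -h]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
C10 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
C01 = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]

assert close(matmul(H, H), I2)                 # H squared is the identity
assert close(matmul(H, matmul(X, H)), Z)       # H X H = Z
assert close(matmul(H, matmul(Z, H)), X)       # H Z H = X

HH = kron(H, H)                                # Hadamard on both Cbits
assert close(matmul(HH, matmul(C10, HH)), C01) # control and target exchanged
print("Hadamards on both Cbits exchange control and target")
```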

Of course, the action of H on the state of a Cbit that follows from
(1.41),

$$\mathbf{H}|0\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \qquad \mathbf{H}|1\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle), \tag{1.45}$$

describes no meaningful transformation of Cbits. Nevertheless, when
combined with other operations, as on the right side of (1.44), the
Hadamard operations result in the perfectly sensible operation given
on the left side. In a quantum computer the action of H on 1-Qbit
states turns out to be not only meaningful but also easily implemented,
and the possibility of interchanging control and target Qbits using only
1-Qbit operators in the manner shown in (1.44) turns out to have some
striking consequences.

The use of Hadamards to interchange the control and target Qbits of
a cNOT operation is sufficiently important in quantum computation
to merit a second derivation of (1.44), which further illustrates the
way in which one uses the operator formalism. In strict analogy to the
definition of cNOT (see (1.21) and the preceding paragraph) we can
define a controlled-Z operation, $\mathbf{C}^Z_{ij}$, which leaves the state of the target
Cbit $j$ unchanged if the state of the control Cbit $i$ is $|0\rangle$, and operates
on the target Cbit with Z if the state of the control Cbit is $|1\rangle$. As a
result $\mathbf{C}^Z_{10}$ acts as the identity on $|xy\rangle$ unless both $x$ and $y$ are 1, in
which case it simply takes $|11\rangle$ into $-|11\rangle$. This behavior is completely
symmetric in the two Cbits, so

$$\mathbf{C}^Z_{ij} = \mathbf{C}^Z_{ji}. \tag{1.46}$$

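It is a straightforward consequence of (1.42) and (1.43) that
sandwiching the target Cbit of a cNOT between Hadamards converts
it to a $\mathbf{C}^Z$:

$$\mathbf{H}_j\mathbf{C}_{ij}\mathbf{H}_j = \mathbf{C}^Z_{ij}, \qquad \mathbf{H}_i\mathbf{C}_{ji}\mathbf{H}_i = \mathbf{C}^Z_{ji}. \tag{1.47}$$

In view of (1.46), we then have

$$\mathbf{H}_j\mathbf{C}_{ij}\mathbf{H}_j = \mathbf{H}_i\mathbf{C}_{ji}\mathbf{H}_i, \tag{1.48}$$

which is equivalent to (1.44), since $\mathbf{H}^2 = \mathbf{1}$.

As a final exercise in treating operations on Cbits as linear operations
on vectors, we construct an alternative form for the swap operator. If
we use (1.39) to reexpress each $\mathbf{n}$ and $\tilde{\mathbf{n}}$ appearing in the swap operator
(1.35) in terms of Z, we find that

$$\mathbf{S}_{ij} = \tfrac{1}{2}(\mathbf{1} + \mathbf{Z}_i\mathbf{Z}_j) + \tfrac{1}{2}(\mathbf{X}_i\mathbf{X}_j)(\mathbf{1} - \mathbf{Z}_i\mathbf{Z}_j). \tag{1.49}$$

If we define

$$\mathbf{Y} = i\mathbf{X}\mathbf{Z} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad (i = \sqrt{-1}), \tag{1.50}$$

we get the more compact form

$$\mathbf{S}_{ij} = \tfrac{1}{2}(\mathbf{1} + \mathbf{X}_i\mathbf{X}_j + \mathbf{Y}_i\mathbf{Y}_j + \mathbf{Z}_i\mathbf{Z}_j). \tag{1.51}$$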
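
The compact form of the swap operator can be checked directly with Python's built-in complex numbers. In the sketch below the 2-Cbit operators are formed as Kronecker products of $2 \times 2$ matrices, with Y taken to be $i$ times the product XZ; the helper names `matmul` and `kron` are assumptions made for this example.

```python
def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


def kron(A, B):
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
Y = [[1j * entry for entry in row] for row in matmul(X, Z)]   # Y = i X Z
assert Y == [[0, -1j], [1j, 0]]

# One half of (1 + X.X + Y.Y + Z.Z), each term acting on both Cbits.
terms = [kron(I2, I2), kron(X, X), kron(Y, Y), kron(Z, Z)]
S_compact = [[sum(t[r][c] for t in terms) / 2 for c in range(4)]
             for r in range(4)]

S_expected = [[1, 0, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]]
assert S_compact == S_expected
print("half of (1 + XX + YY + ZZ) is the swap matrix")
```

The entries of `S_compact` are complex numbers with zero imaginary part, which Python treats as equal to the corresponding integers.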



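For three quarters of a century physicists have enjoyed grouping the
matrix representations of the three operators X, Y, and Z into a
“3-vector” $\vec{\sigma}$ whose “components” are $2 \times 2$ matrices:

$$\sigma_x = \mathbf{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \mathbf{Y} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \mathbf{Z} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1.52}$$

The swap operator then becomes³

$$\mathbf{S}_{ij} = \tfrac{1}{2}\left(\mathbf{1} + \vec{\sigma}^{(i)} \cdot \vec{\sigma}^{(j)}\right), \tag{1.53}$$

where $\cdot$ represents the ordinary three-dimensional scalar product:

$$\vec{\sigma}^{(i)} \cdot \vec{\sigma}^{(j)} = \sigma_x^{(i)}\sigma_x^{(j)} + \sigma_y^{(i)}\sigma_y^{(j)} + \sigma_z^{(i)}\sigma_z^{(j)}. \tag{1.54}$$

The three components of $\vec{\sigma}$ have many properties that are
unchanged under cyclic permutations of $x$, $y$, and $z$. All three are
Hermitian.⁴ All square to unity,

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \mathbf{1}. \tag{1.55}$$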


3 Physicists might enjoy the simplicity of this “computational” derivation of 
the form of the exchange operator, compared with the conventional 
quantum-mechanical derivation, which invokes the full apparatus of 
angular-momentum theory. 

4 The elements of a Hermitian matrix A satisfy $A_{ji} = A_{ij}^*$, where * denotes
complex conjugation. A fuller statement in a broader context can be found in
Appendix A.







They all anticommute in pairs and the product of any two of them is
simply related to the third:

$$\begin{aligned} \sigma_x\sigma_y &= -\sigma_y\sigma_x = i\sigma_z, \\ \sigma_y\sigma_z &= -\sigma_z\sigma_y = i\sigma_x, \\ \sigma_z\sigma_x &= -\sigma_x\sigma_z = i\sigma_y. \end{aligned} \tag{1.56}$$

The three relations (1.56) differ only by cyclic permutations of $x$, $y$,
and $z$.

All the relations in (1.55) and (1.56) can be summarized in a single
compact and useful identity. Let $\vec{a}$ and $\vec{b}$ be two 3-vectors with
components $a_x, a_y, a_z$ and $b_x, b_y, b_z$ that are ordinary real numbers.
(They can also be complex numbers, but in most useful applications
they are real.) Then one easily confirms that all the relations in (1.55)
and (1.56) imply and are implied by the single identity

$$(\vec{a} \cdot \vec{\sigma})(\vec{b} \cdot \vec{\sigma}) = (\vec{a} \cdot \vec{b})\mathbf{1} + i(\vec{a} \times \vec{b}) \cdot \vec{\sigma}, \tag{1.57}$$

where $\vec{a} \times \vec{b}$ denotes the vector product (or “cross product”) of $\vec{a}$
and $\vec{b}$,

$$\begin{aligned} (\vec{a} \times \vec{b})_x &= a_y b_z - a_z b_y, \\ (\vec{a} \times \vec{b})_y &= a_z b_x - a_x b_z, \\ (\vec{a} \times \vec{b})_z &= a_x b_y - a_y b_x. \end{aligned} \tag{1.58}$$
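
One way to gain confidence in this identity is to test it numerically for a few real vectors. The Python sketch below does so with integer components, so that the complex arithmetic is exact; the helper names (`matmul`, `dot_sigma`, `cross`, `pauli_identity_holds`) and the sample vectors are assumptions chosen for illustration.

```python
def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def dot_sigma(v):
    """The 2x2 matrix v_x*sigma_x + v_y*sigma_y + v_z*sigma_z."""
    return [[v[0] * SX[r][c] + v[1] * SY[r][c] + v[2] * SZ[r][c]
             for c in range(2)] for r in range(2)]


def cross(a, b):
    """Ordinary vector (cross) product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pauli_identity_holds(a, b):
    lhs = matmul(dot_sigma(a), dot_sigma(b))
    a_dot_b = sum(x * y for x, y in zip(a, b))
    cs = dot_sigma(cross(a, b))
    rhs = [[a_dot_b * I2[r][c] + 1j * cs[r][c] for c in range(2)]
           for r in range(2)]
    return lhs == rhs


assert pauli_identity_holds((1, 0, 0), (0, 1, 0))    # sigma_x sigma_y = i sigma_z
assert pauli_identity_holds((1, 0, 0), (1, 0, 0))    # sigma_x squared = 1
assert pauli_identity_holds((1, 2, 3), (4, -5, 6))   # a generic integer example
print("identity holds for the sampled vectors")
```

Taking both vectors along coordinate axes recovers the individual product relations, which is the sense in which the single identity summarizes them.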

Together with the unit matrix 1, the matrices $\sigma_x$, $\sigma_y$, and $\sigma_z$ form
a basis for the four-dimensional algebra of two-dimensional matrices
of complex numbers: any such matrix is a unique linear combination of
these four with complex coefficients. Because the four are all Hermitian,
any two-dimensional Hermitian matrix A of complex numbers must
be a real linear combination of the four, and therefore of the form

$$A = a_0\mathbf{1} + \vec{a} \cdot \vec{\sigma}, \tag{1.59}$$

where $a_0$ and the components of the 3-vector $\vec{a}$ are all real numbers.

The matrices $\sigma_x$, $\sigma_y$, and $\sigma_z$ were introduced in the early days of
quantum mechanics by Wolfgang Pauli, to describe the angular
momentum associated with the spin of an electron. They have many other
useful purposes, being simply related to the quaternions invented by
Hamilton to deal efficiently with the composition of three-dimensional
rotations.⁵ It is pleasing to find them here, buried in the interior of the
operator that simply swaps two classical bits. We shall have extensive
occasion to use Pauli's 1-Qbit operators when we come to the subject of
quantum error correction. Some of their properties, developed further
in Appendix B, prove to be quite useful in treating Qbits, to which we
now turn.


5 Hamilton's quaternions $i$, $j$, $k$ are represented by $i\sigma_x$, $i\sigma_y$, $i\sigma_z$. The
beautiful and useful connection between Pauli matrices and
three-dimensional rotations discovered by Hamilton is developed in
Appendix B.


1.5 Qbits and their states 

The state of a Cbit is a pretty miserable specimen of a two-dimensional
vector. The only vectors with any classical meaning in the whole
two-dimensional vector space are the two orthonormal vectors $|0\rangle$ and $|1\rangle$,
since those are the only two states a Cbit can have. Happily, nature has
provided us with physical systems, Qbits, described by states that do
not suffer from this limitation. The state $|\psi\rangle$ associated with a Qbit
can be any unit vector in the two-dimensional vector space spanned by
$|0\rangle$ and $|1\rangle$ over the complex numbers. The general state of a Qbit is

$$|\psi\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix}, \tag{1.60}$$

where $\alpha_0$ and $\alpha_1$ are two complex numbers constrained only by the
requirement that $|\psi\rangle$, like $|0\rangle$ and $|1\rangle$, should be a unit vector in the
complex vector space - i.e. only by the normalization condition

$$|\alpha_0|^2 + |\alpha_1|^2 = 1. \tag{1.61}$$

The state $|\psi\rangle$ is said to be a superposition of the states $|0\rangle$ and $|1\rangle$ with
amplitudes $\alpha_0$ and $\alpha_1$. If one of $\alpha_0$ and $\alpha_1$ is 0 and the other is 1 - i.e.
the special case in which the state of the Qbit is one of the two classical
states $|0\rangle$ or $|1\rangle$ - it can be convenient to retain the language appropriate
to Cbits, speaking of the Qbit “having the value” 0 or 1. More correctly,
however, one is entitled to say only that the state of the Qbit is $|0\rangle$ or
$|1\rangle$. Qbits, in contrast to Cbits, cannot be said to “have values.” They
have - or, more correctly, are described by, or, better still, are associated
with - states. We shall often sacrifice correctness for ease of expression.
Some reasons for this apparently pedantic terminological hair splitting
will emerge below.

Just as the general state of a single Qbit is any normalized
superposition (1.60) of the two possible classical states, the general state $|\Psi\rangle$
that nature allows us to associate with two Qbits is any normalized
superposition of the four orthogonal classical states,

$$|\Psi\rangle = \alpha_{00}|00\rangle + \alpha_{01}|01\rangle + \alpha_{10}|10\rangle + \alpha_{11}|11\rangle = \begin{pmatrix} \alpha_{00} \\ \alpha_{01} \\ \alpha_{10} \\ \alpha_{11} \end{pmatrix}, \tag{1.62}$$

with the complex amplitudes being constrained only by the
normalization condition

$$|\alpha_{00}|^2 + |\alpha_{01}|^2 + |\alpha_{10}|^2 + |\alpha_{11}|^2 = 1. \tag{1.63}$$

This generalizes in the obvious way to $n$ Qbits, whose general state can
be any superposition of the $2^n$ different classical states, with amplitudes
whose squared magnitudes sum to unity:

$$|\Psi\rangle = \sum_{0 \le x < 2^n} \alpha_x |x\rangle_n, \tag{1.64}$$

$$\sum_{0 \le x < 2^n} |\alpha_x|^2 = 1. \tag{1.65}$$

In the context of quantum computation, the set of $2^n$ classical states -
all the possible tensor products of $n$ individual Qbit states $|0\rangle$ and $|1\rangle$ -
is called the computational basis. For most purposes classical basis is a
more appropriate term. I shall use the two interchangeably. The states
that characterize $n$ Cbits - the classical-basis states - are an extremely
limited subset of the states of $n$ Qbits, which can be any (normalized)
superposition with complex coefficients of these classical-basis states.

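If we have two Qbits, one in the state $|\psi\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle$ and the
other in the state $|\phi\rangle = \beta_0|0\rangle + \beta_1|1\rangle$, then the state $|\Psi\rangle$ of the pair,
in a straightforward generalization of the rule for multi-Cbit states, is
taken to be the tensor product of the individual states,

$$\begin{aligned} |\Psi\rangle = |\psi\rangle \otimes |\phi\rangle &= (\alpha_0|0\rangle + \alpha_1|1\rangle) \otimes (\beta_0|0\rangle + \beta_1|1\rangle) \\ &= \alpha_0\beta_0|00\rangle + \alpha_0\beta_1|01\rangle + \alpha_1\beta_0|10\rangle + \alpha_1\beta_1|11\rangle = \begin{pmatrix} \alpha_0\beta_0 \\ \alpha_0\beta_1 \\ \alpha_1\beta_0 \\ \alpha_1\beta_1 \end{pmatrix}. \end{aligned} \tag{1.66}$$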


Note that a general 2-Qbit state (1.62) is of the special form (1.66) if
and only if $\alpha_{00}\alpha_{11} = \alpha_{01}\alpha_{10}$. Since the four amplitudes in (1.62) are
constrained only by the normalization condition (1.63), this relation
need not hold, and the general 2-Qbit state, unlike the general state
of two Cbits, is not a product (1.66) of two 1-Qbit states. The same is
true for states of $n$ Qbits. Unlike Cbits, whose general state can only
be one of the $2^n$ products of $|0\rangle$s and $|1\rangle$s, a general state of $n$ Qbits
is a superposition of these $2^n$ product states and cannot, in general,
be expressed as a product of any set of 1-Qbit states. Individual Qbits
making up a multi-Qbit system, in contrast to individual Cbits, cannot
always be characterized as having individual states of their own.⁶
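
The product-state criterion for two Qbits is easy to try out numerically. In the Python sketch below a 2-Qbit state is a list of four complex amplitudes in the order 00, 01, 10, 11, and the test compares the two cross products of amplitudes up to a small tolerance. The function names `kron_vec` and `is_product_state`, the tolerance, and the example states are assumptions made for illustration.

```python
import math


def kron_vec(a, b):
    """Tensor product of two 1-Qbit amplitude lists."""
    return [x * y for x in a for y in b]


def is_product_state(amps, tol=1e-12):
    """Test a00*a11 == a01*a10 for amplitudes ordered 00, 01, 10, 11."""
    a00, a01, a10, a11 = amps
    return abs(a00 * a11 - a01 * a10) < tol


h = 1 / math.sqrt(2)
psi = [0.6, 0.8]                 # a normalized 1-Qbit state
phi = [h, 1j * h]                # another, with a complex amplitude

product = kron_vec(psi, phi)     # of the product form by construction
entangled = [h, 0, 0, h]         # (|00> + |11>)/sqrt(2)

print(is_product_state(product))     # True
print(is_product_state(entangled))   # False
```

For the second state the two cross products are 1/2 and 0, so no choice of 1-Qbit states can reproduce it as a tensor product.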

Such nonproduct states of two or more Qbits are called entangled
states. The term is a translation of Schrödinger's verschränkt, which I
am told is rendered more accurately as “entwined” or “enfolded.” But
Schrödinger himself used the English word “entangled,” and may even
have used it before coining the German term. When the state of several
Qbits is entangled, they can sometimes behave in some very strange
ways. An example of such peculiar behavior is discussed in Appendix
D. Aside from its intrinsic interest, the appendix provides some further
exercise in the analytical manipulation of Qbits.


6 More precisely, they do not always have what are called pure states of their
own. It is often convenient to give a statistical description of an individual
Qbit (or a group of Qbits) in terms of what is called a density matrix or mixed
state. If one wishes to emphasize that one is not talking about a mixed state,
one uses the term “pure state.” In this book the term “state” always means
“pure state.”


1.6 Reversible operations on Qbits 

The only nontrivial reversible operation a classical computer can
perform on a single Cbit is the NOT operation X. Nature has been far
more versatile in what it allows us to do to a Qbit. The reversible
operations that a quantum computer can perform upon a single Qbit
are represented by the action on the state of the Qbit of any linear
transformation that takes unit vectors into unit vectors. Such
transformations u are called unitary and satisfy the condition⁷

$$\mathbf{u}\mathbf{u}^\dagger = \mathbf{u}^\dagger\mathbf{u} = \mathbf{1}. \tag{1.67}$$

Since any unitary transformation has a unitary inverse, such actions of a
quantum computer on a Qbit are reversible. The reason why
reversibility is crucial for the effective functioning of a quantum computer will
emerge in Chapter 2.

The most general reversible $n$-Cbit operation in a classical
computer is any one of the $(2^n)!$ permutations of the $2^n$ different
classical-basis states. The most general reversible operation that a quantum
computer can perform upon $n$ Qbits is represented by the action on their
state of any linear transformation that takes unit vectors into unit
vectors - i.e. any $2^n$-dimensional unitary transformation U, satisfying

$$\mathbf{U}\mathbf{U}^\dagger = \mathbf{U}^\dagger\mathbf{U} = \mathbf{1}. \tag{1.68}$$

Any reversible operation on $n$ Cbits - i.e. any permutation P of the
$2^n$ Cbit states - can be associated with a unitary operation U on $n$ Qbits.
One defines the action of U on the classical-basis states of the Qbit to
be identical to the operation of P on the corresponding classical states
of the Cbit. Since the classical basis is a basis, U can be extended to
arbitrary $n$-Qbit states by requiring it to be linear. Since the action
of U on the classical-basis states is to permute them, its effect on any
superposition of such states $\sum \alpha_x |x\rangle_n$ is to permute the amplitudes
$\alpha_x$. Such a permutation preserves the value of $\sum |\alpha_x|^2$, so U takes
unit vectors into unit vectors. Being norm-preserving and linear, U is
indeed unitary.
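
The argument can be illustrated with a short Python sketch: a permutation of the classical-basis states is stored as a list, its linear extension simply moves each amplitude to the position of the permuted basis state, and the sum of squared magnitudes is unchanged. The names `extend_permutation`, `norm_squared`, and `CNOT_10`, and the sample amplitudes, are assumptions made for this example.

```python
def extend_permutation(perm):
    """Linear extension of a basis-state permutation x -> perm[x]."""
    def U(amps):
        out = [0] * len(amps)
        for x, amplitude in enumerate(amps):
            out[perm[x]] = amplitude     # U|x> = |perm[x]>
        return out
    return U


def norm_squared(amps):
    return sum(abs(a) ** 2 for a in amps)


# cNOT with the control on the left, as a permutation of |0>_2 ... |3>_2:
# it exchanges |10> = |2>_2 and |11> = |3>_2.
CNOT_10 = [0, 1, 3, 2]
U = extend_permutation(CNOT_10)

state = [0.5, 0.5j, -0.5, 0.5]           # a normalized 2-Qbit superposition
new_state = U(state)

print(new_state)                          # [0.5, 0.5j, 0.5, -0.5]
print(norm_squared(state), norm_squared(new_state))   # 1.0 1.0
```

Because the output is just a rearrangement of the input amplitudes, the same reasoning applies to any permutation and any number of Qbits.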


7 These and other facts about linear operators on vector spaces over the 
complex numbers are also reviewed and summarized in Appendix A. 

Many important unitary operations on Qbits that we shall be
examining below are defined in this way, as permutations of the classical-basis
states, which are implicitly understood to be extended by linearity to
all Qbit states. In particular, the transformations NOT, SWAP, and
cNOT on Cbits are immediately defined in this way for Qbits as well.
But the available unitary transformations on Qbits are, of course, much
more general than straightforward extensions of classical operations.
We have already encountered two such examples, the operator Z and
the Hadamard transformation H. Both of these take the classical-basis
states of a Qbit into another orthonormal basis, so their linear
extensions to all Qbit states are necessarily unitary.

In designing quantum algorithms, the class of allowed unitary
transformations is almost always restricted to ones that can be built entirely
out of products of unitary transformations that act on only one Qbit
at a time, called 1-Qbit gates, or that act on just a pair of Qbits, called
2-Qbit gates. This restriction is imposed because the technical problems
of making higher-order quantum gates are even more formidable than
the (already difficult) problems of constructing reliable 1- and 2-Qbit
gates.

It turns out that this is not a fundamental limitation, since arbitrary 
unitary transformations can be approximated to an arbitrary degree 
of precision by sufficiently many 1- and 2-Qbit gates. We shall not 
prove this general result, 8 because all of the quantum algorithms to 
be developed here will be explicitly built up entirely out of 1- and 
2-Qbit gates. One very important illustration of the sufficiency of 1- 
and 2-Qbit gates will emerge in Chapter 2. For a reversible classical 
computer, it can be shown that at least one 3-Cbit gate is needed to 
build up general logical operations. But, in a quantum computer, we 
shall find, remarkably - and importantly for the feasibility of practical 
quantum computation - that the quantum extension of this 3-Cbit gate 
can be constructed out of a small number of 1- and 2-Qbit gates. 

While unitarity is generally taken to be the hallmark of the
transformations nature allows us to perform on quantum states, what is really
remarkable about the transformations of Qbit states is their linearity
(which is, of course, one aspect of their unitarity). It is easy to dream
up simple classical models for a Qbit, particularly if one restricts its
states to real linear combinations of the two computational basis states.
It is not hard to invent classical models for NOT and Hadamard
1-Qbit gates that act linearly on all the 1-Qbit states of the model Qbit.
But I know of no classical model that can extend a cNOT on the
four computational basis states of two Cbits to an operation that acts


8 The argument is given by David P. DiVincenzo, “Two-bit gates are universal 
for quantum computation,” Physical Review A 51, 1015-1022 (1995), 
http://arxiv.org/abs/quant-ph/9407022. 
Fig 1.1


A circuit diagram representing the action on a single Qbit of
the 1-Qbit gate u. Initially the Qbit is described by the input state
$|\psi\rangle$ on the left. The thin line (wire) represents the subsequent
history of the Qbit. After emerging from the box representing u, the Qbit is
described on the right by the final state $u|\psi\rangle$.



Fig 1.2


A circuit diagram representing the action on n Qbits of the
n-Qbit gate U. Initially the Qbits are described by the input state
$|\Psi\rangle$ on the left. The thick line (bar) represents the subsequent
history of the Qbits. After emerging from the box representing U, the Qbits
are described on the right by the final state $U|\Psi\rangle$.


linearly on all the states of two model Qbits. It is a remarkable and 
highly nontrivial fact about the physical world that nature does allow 
us, with much ingenuity and hard work, to fabricate unitary cNOT 
gates for a pair of genuine quantum Qbits. 


1.7 Circuit diagrams 

It is the practice in quantum computer science to represent the action 
of a sequence of gates acting on n Qbits by a circuit diagram. The initial 
state of the Qbits appears on the left, the final state on the right, and the 
gates themselves in the central part of the figure. Figure 1.1 shows the 
simplest possible such diagram: a Qbit initially in the state $|\psi\rangle$
is acted on by a 1-Qbit gate u, with the result that the Qbit is assigned the
new state $u|\psi\rangle$. Figure 1.2 shows the analogous diagram for an n-Qbit
gate U and an n-Qbit initial state $|\Psi\rangle$. The line that goes into and
out of the box representing the unitary transformation - which becomes useful
when one starts chaining together a sequence of gates - is sometimes
called a wire in the case of a single Qbit, and the thicker line (which
represents n wires) associated with an n-Qbit gate is sometimes called
a bar. 

Figure 1.3 reveals a peculiar feature of these circuit diagrams that it is 
important to be aware of. The diagrams are read from left to right (as one 
reads ordinary prose in European languages). Part (a) portrays a circuit 
that acts first with V and then with U on the initial state $|\Psi\rangle$.
The result is the state $UV|\Psi\rangle$, because it is the convention, in
writing equations
for linear operators on vector spaces, that the operation appears to the 
Fig 1.3


(a) A circuit diagram representing the action on n Qbits of two
n-Qbit gates. Initially the Qbits are described by the input state
$|\Psi\rangle$ on the left. They are acted upon first by the gate V and then
by the gate U, emerging on the right in the final state $UV|\Psi\rangle$. Note
that the order in which the Qbits encounter unitary gates in the figure is
opposite to the order in which the corresponding symbols are written in the
symbol for the final state on the right. (b) This emphasizes the unfortunate
convention that, because gates on the left act before gates on the right in a
circuit diagram, a circuit showing V on the left and U on the right
represents the operation conventionally denoted by UV.


left of the state on which it acts. Thus the sequence of symbols
$|\Psi\rangle$, V, and U on the left of the circuit diagram in (a) is reversed
from the
sequence in which they appear in the mathematical representation of 
the state that is produced on the right. Part (b) shows the consequences 
of this for the part of the circuit diagram containing just the gates: a 
diagram in which a gate V (on the left) is followed by a gate U on the 
right describes the unitary transformation UV. 
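
The ordering convention can be made concrete with a small numerical check. This is only an illustrative sketch: the helper names `matmul`, `apply`, and `run_circuit` are invented here, and the choice V = H (Hadamard) and U = Z is an assumption picked because these two gates do not commute. Running the gates in the left-to-right order of the diagram agrees with applying the single operator UV, and not VU.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(gate, state):
    """Apply a matrix to a column of amplitudes."""
    return [sum(gate[i][j] * state[j] for j in range(len(state)))
            for i in range(len(gate))]


def run_circuit(gates_left_to_right, state):
    """Act with the gates in the order they are drawn in a circuit diagram."""
    for gate in gates_left_to_right:
        state = apply(gate, state)
    return state


s = 1 / math.sqrt(2)
V = [[s, s], [s, -s]]   # Hadamard, drawn on the left (acts first)
U = [[1, 0], [0, -1]]   # Z, drawn on the right (acts second)

psi = [1, 0]            # the state |0>

circuit_result = run_circuit([V, U], psi)   # diagram order: V then U
operator_result = apply(matmul(U, V), psi)  # equation order: UV
wrong_order = apply(matmul(V, U), psi)      # VU, the easy mistake

print(circuit_result, operator_result, wrong_order)
```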

One should be wary of the possibility for confusion arising from 
the fact that operators (and states) in circuit diagrams always appear 
in the diagrams in the opposite sequence from the order in which 
they appear on the page in the corresponding equations. While several
of the most important diagrams we shall encounter are left-right
symmetric, many are not, so one should be on guard against getting 
things backwards when translating equations into circuit diagrams and 
vice versa. 

In lecturing on quantum computation I tried for several years to 
reverse the computer-science convention, putting the initial state on 
the right of the circuit diagram and letting the gates on the right act 
first. This has the great advantage of making the diagram look like 
the equation it represents. It has, however, a major disadvantage, even 
setting aside the fact that it flies in the face of well established
convention. It requires one to write on the blackboard in the wrong direction,
from right to left, whenever one wishes to produce a circuit diagram.
Guessing how far to the right one should start is hard to do if the
diagram is a lengthy one, and for this reason I gave up after a few years
and reverted to the conventional form. A better alternative would be 
for physicists to start writing their equations with the states on the left 
(represented by bra vectors rather than ket vectors 9 ) and with linear 
operators appearing to the right of the states on which they act. But 
this would require abandoning a tradition that goes back three quarters 
of a century. So we are stuck with a clash of cultures, and must simply 
keep in mind that confusion can arise if one forgets the elementary fact 
represented in Figure 1.3(b). 

There is little utility to circuit diagrams of the simple form in 
Figures 1.1-1.3, but they are important as building blocks out of which


9 See Appendix A for the distinction between bras and kets. 
larger circuit diagrams are constructed. As the number of operations
increases, the diagrams enable one to see at a glance the action of a
sequence of 1- and 2-Qbit unitary gates on a collection of many Qbits
in a way that is far more transparent and much more easily remembered
than the corresponding formulae. Indeed, many calculations that
involve rather lengthy equations can be simply accomplished by
manipulating circuit diagrams, as we shall see.

When the state vectors entering or leaving a wire or bar in a circuit 
diagram are computational-basis states like $|x\rangle$, one sometimes omits
the symbol $|\ \rangle$ and simply writes x.

1.8 Measurement gates and the Born rule 

To give the state of a single Cbit you need only one bit of information:
whether the state of the Cbit is $|0\rangle$ or $|1\rangle$. But to specify
the state (1.60) of a single Qbit to an arbitrarily high degree of precision,
you need arbitrarily many bits of information, since you must specify two
complex numbers $\alpha$ and $\beta$ subject only to the normalization
constraint (1.61). Because Qbits not only have a much richer set of states than
Cbits, but also can be acted on by a correspondingly richer set of
transformations, it might appear obvious that a quantum computer
would be vastly more powerful than a classical computer. But there is a
major catch!

The catch is this: if you have n Cbits, each representing either 0 or
1, you can find out the state of each just by looking. There is nothing
problematic about learning the state of a Cbit, and hence learning the
result of any calculation you may have built up out of operations on those
Cbits. Furthermore - and this is taken for granted in any discussion of
a classical computer - the state of Cbits is not altered by the process
of reading them. The act of acquiring the information from Cbits is
not disruptive. You can read the Cbits at any stage of a computation
without messing up subsequent stages.

In stark contrast, if you have n Qbits in a superposition (1.64) of 
computational basis states, there is nothing whatever you can do to them 
to extract from those Qbits the vast amount of information contained in 
the amplitudes $\alpha_x$. You cannot read out the values of those amplitudes,
and therefore you cannot find out what the state is. The state of n Qbits
is not associated with any ascertainable property of those Qbits, as it is
for Cbits.

There is only one way to extract information from n Qbits in a 
given state. It is called making a measurement. 10 Making a measurement 


10 Physicists will note - others need pay no attention to this remark - that 
what follows is more accurately characterized as “making a (von Neumann) 
measurement in the computational (classical) basis.” There are other ways 
consists of performing a certain test on each Qbit, the outcome of which
is either 0 or 1. The particular collection of zeros and ones produced by
the test is not in general determined by the state $|\Psi\rangle$ of the
Qbits; the state determines only the probability of the possible outcomes,
according to the following rule: the probability of getting a particular
result - say 01100, if you have five Qbits - is given by the squared magnitude
of the amplitude of the state $|01100\rangle$ in the expansion of the state
$|\Psi\rangle$ of the Qbits in the $2^5$ computational basis states. More
generally, if the state of n Qbits is

$$|\Psi\rangle_n = \sum_{0 \le x < 2^n} \alpha_x |x\rangle_n, \tag{1.69}$$

then the probability that the zeros and ones resulting from measurements
of all the Qbits will give the binary expansion of the integer x is

$$p(x) = |\alpha_x|^2. \tag{1.70}$$

This basic rule for how information can be extracted from a quantum
state was first enunciated by Max Born, and is known as the Born
rule. It provides the link between amplitudes and the numbers you can
actually read out when you test - i.e. measure - the Qbits. The squared
magnitudes of the amplitudes give the probabilities of outcomes of
measurements. Normalization conditions like (1.65) are just the
requirements that the probabilities for all of the $2^n$ mutually exclusive
outcomes add up to 1.
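
As a minimal sketch of the Born rule in practice, the following simulates repeated measurements of freshly prepared copies of a 2-Qbit state. The function names `born_probabilities` and `sample_outcome`, the sample amplitudes, and the use of a seeded pseudo-random generator are assumptions made for illustration only; a real measurement gate is, of course, a piece of hardware and not a random-number generator.

```python
import random


def born_probabilities(amplitudes):
    """Squared magnitudes of the amplitudes: p(x) = |alpha_x|^2."""
    return [abs(a) ** 2 for a in amplitudes]


def sample_outcome(amplitudes, rng):
    """Draw one measurement outcome x with probability |alpha_x|^2."""
    probs = born_probabilities(amplitudes)
    return rng.choices(range(len(amplitudes)), weights=probs)[0]


# Hypothetical 2-Qbit state 0.6|00> + 0.8i|11>.
amps = [0.6, 0, 0, 0.8j]
probs = born_probabilities(amps)

rng = random.Random(0)          # seeded only to make the run repeatable
counts = [0] * len(amps)
for _ in range(1000):
    counts[sample_outcome(amps, rng)] += 1

print(probs)    # probabilities of the four outcomes 00, 01, 10, 11
print(counts)   # outcomes 01 and 10 never occur
```

Only the relative frequencies in `counts` are accessible to an experimenter; the complex amplitudes themselves, including the phase of the second one, never appear in the output.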

The process of measurement is carried out by a piece of hardware
with a digital display, known as an n-Qbit measurement gate. Such an
n-Qbit measurement gate is depicted schematically in Figure 1.4. In
contrast to unitary gates, which have a unique output state for each
input state, the state of the Qbits emerging from a measurement gate is
only statistically determined by the state of the input Qbits. In further
contrast to unitary gates, the action of a measurement gate cannot be
undone: given the final state $|x\rangle$, there is no way of reconstructing
the initial state $|\Psi\rangle$. Measurement is irreversible. Nor is the
action of a measurement gate in any sense linear.

To the extent that it suggests that some preexisting property is being 
revealed, “measurement” is a dangerously misleading term, but it is 


to make such a measurement, but they can all be reduced to measurements 
in the computational basis if an appropriate unitary transformation is 
applied to the n-Qbit state of the computer just before carrying out the
measurement. In this book the term “measurement” always means 
measurement in the computational basis. Measurements in other bases will 
always be treated as measurements in the computational basis preceded by 
suitable unitary transformations. There are also more general forms of 
measurement than von Neumann measurements, going under the 
unpleasant acronym POVM (for “positive operator-valued measure”). We 
shall make no explicit use of POVMs. 
Fig 1.4


A circuit diagram representing an n-Qbit measurement gate.
The Qbits are initially described by the n-Qbit state

$$|\Psi\rangle_n = \sum_{0 \le x < 2^n} \alpha_x |x\rangle_n$$

on the left. After the measurement gate $M_n$ has acted, with probability
$p = |\alpha_x|^2$ it indicates an integer x, $0 \le x < 2^n$, and the Qbits
are subsequently described by the state $|x\rangle_n$ on the right.


hallowed by three quarters of a century of use by quantum physicists,
and impossible to avoid in treatments of quantum computation. One
should avoid being misled by such spurious connotations of
“measurement,” though it confused many physicists in the early days of quantum
mechanics and may well continue to confuse some to this day. In quantum
computation “measurement” means nothing more or less than
applying and reading the display of an appropriate measurement gate,
whose action is fully specified by the Born rule, as described above,
and expanded upon below. While measurement in quantum mechanics
is not at all like measuring somebody’s weight, it does have some
resemblance to measuring Alice’s IQ, which, one can argue, reveals no
preexisting numerical property of Alice, but only what happens when
she is subjected to an IQ test.

The simplest statement of the Born rule is for a single Qbit. If the 
state of the Qbit is the superposition (1.60) of the states |0) and |1) 
with amplitudes $\alpha_0$ and $\alpha_1$ then the result of the measurement
is 0 with probability $|\alpha_0|^2$ and 1 with probability $|\alpha_1|^2$.
This measurement is
carried out by a 1-Qbit measurement gate, as illustrated in Figure 1.5. 
We shall see below that n-Qbit measurement gates can be realized by 
applying 1-Qbit measurement gates to each of the n Qbits. The process 
of measurement can thus be reduced to applying multiple copies of a 
single elementary piece of hardware: the 1-Qbit measurement gate. 

In addition to displaying an n-bit integer with probabilities
determined by the amplitudes, there is a second very important aspect of the
action of measurement gates: if n Qbits, initially described by a state
$|\Psi\rangle$, are sent through an n-Qbit measurement gate, and the display of
the measurement gate indicates the integer x, then one must associate
with the Qbits emerging from that measurement gate the classical-basis
state $|x\rangle_n$, as shown in Figures 1.4 and 1.5. This means that all
traces of the amplitudes $\alpha_x$ characterizing the input state have
vanished from the output state. The only role they have played in the
measurement is to determine the probability of a particular output.
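
The two aspects of a measurement gate, a random reading and a change of state assignment, can be sketched together. The function `measure_all` below is a hypothetical simulation helper, not hardware and not notation from the text; the input state and the seed are likewise arbitrary. It returns the reading x and the classical-basis state assigned afterwards, and a second measurement of that output then repeats the first reading with certainty.

```python
import random


def measure_all(amplitudes, rng):
    """Simulate an n-Qbit measurement gate.

    Returns (x, post_state): the displayed integer x, drawn with
    probability |alpha_x|^2, and the classical-basis state |x> assigned
    to the Qbits afterwards. No trace of the input amplitudes survives.
    """
    probs = [abs(a) ** 2 for a in amplitudes]
    x = rng.choices(range(len(amplitudes)), weights=probs)[0]
    post_state = [0j] * len(amplitudes)
    post_state[x] = 1 + 0j
    return x, post_state


rng = random.Random(1)
psi = [0.5, 0.5, 0.5, -0.5]        # hypothetical 2-Qbit input state

x, post = measure_all(psi, rng)     # first measurement: random reading
x2, post2 = measure_all(post, rng)  # second measurement: same reading

print(x, post)
print(x2, post2)
```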
Fig 1.5


A special case of Figure 1.4: a 1-Qbit measurement gate. The
reading x of the gate is either 0 or 1.

If the state of the input Qbits is one of the classical-basis states
$|x\rangle_n$, then according to the Born rule the probability that the
measurement gate will read x and the output state will remain $|x\rangle_n$
is 1. But for superpositions (1.69) with more than a single nonzero amplitude
$\alpha_x$, the output state is not determined. Being a single one of the
classical basis states $|x\rangle_n$, the output state no longer carries any
information about the amplitudes characterizing the initial state, other than
certifying that the particular amplitude $\alpha_x$ was not zero, and, in all
likelihood, was not exceedingly small.

So once you send n Qbits through an n-Qbit measurement gate,
you remove the possibility of extracting any further information about
their original state $|\Psi\rangle$. After such a measurement of five Qbits,
if the result is 01100, then the post-measurement state associated with the
Qbits is no longer $|\Psi\rangle$, but $|01100\rangle$. The original state
$|\Psi\rangle$, with all the rich information potentially available in its
amplitudes, is irretrievably lost. Qbits emerging from a measurement gate
that indicates the outcome x are characterized by the state $|x\rangle$,
regardless of what their pre-measurement state may have been.

This change of state attendant upon a measurement is often referred
to as a reduction or collapse of the state. One says that the
pre-measurement state reduces or collapses to the post-measurement state, as
a consequence of the measurement. This should not be taken to imply
(though, alas, it often is) that the Qbits themselves suffer a catastrophic
“reduction” or “collapse.” It is important to keep in mind, in this
context, that the state of n Qbits is nothing more than an abstract symbol,
used, via the Born rule, to calculate probabilities of measurement
outcomes. As has already been noted, there is no internal property of the
Qbits that corresponds to their state.

You might well wonder how one can learn anything at all of
computational interest under these wretched conditions. The artistry of
quantum computation consists of producing, through a cunningly
constructed unitary transformation, a superposition in which most of the
amplitudes $\alpha_x$ are zero or extremely close to zero, with useful
information being carried by any of the values of x that have an appreciable
probability of being indicated by the measurement. It is thus important
to be seeking information that, once possessed, can easily be confirmed,
perhaps with an ordinary (classical) computer (e.g. the factors of a large
number), so that one is not misled by rare and irrelevant low-probability
outcomes. How this is actually accomplished in various cases of interest
will be one of our major preoccupations.

It is important to note and immediately reject a possible
misunderstanding of the Born rule. One might be tempted to infer from
the rule that for a Qbit to be in a superposition, such as the state
$|\psi\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle$, means nothing more
than that the “actual state” of the Qbit is either $|0\rangle$ with
probability $|\alpha_0|^2$ or $|1\rangle$ with probability $|\alpha_1|^2$.
Such an assertion goes beyond the rule, of course, which merely
asserts that if one subjects a Qbit in the state $|\psi\rangle$ to an
appropriate test - a measurement - then the outcome of the test will be 0 or 1
with those probabilities and the post-measurement state of the Qbit
can correspondingly be taken to be $|0\rangle$ or $|1\rangle$. This does not
imply that prior to the test the Qbit already carried the value revealed by
the test and was already described by the corresponding classical-basis state,
since, among other possibilities, the action of the test itself might well
play a role in bringing forth the outcome.

In fact, it is easy to produce examples that demonstrate that the Qbit,
prior to the test, could not have been in either of the states $|0\rangle$
and $|1\rangle$. We can see this with the help of the Hadamard transformation
(1.41). We have defined the action of the 1-Qbit operators H, X, and Z only
on the computational-basis states $|0\rangle$ and $|1\rangle$, but, as noted
above, we can extend their action to arbitrary linear combinations of these
states by requiring the extensions to be linear operators. Since the states
$|0\rangle$ and $|1\rangle$ form a basis, this determines the action of H, X,
and Z on any 1-Qbit state.

Because it is linear and norm-preserving, H is unitary, and is therefore
the kind of operation a quantum computer can apply to the state
of a Qbit: a 1-Qbit gate. The result of the operation of a Hadamard
gate is to change the state $|\phi\rangle$ of a Qbit to $H|\phi\rangle$.
Suppose, now, that we apply H to a Qbit that is initially in the state

$$|\phi\rangle = \frac{1}{\sqrt{2}}\bigl(|0\rangle + |1\rangle\bigr). \tag{1.71}$$

It follows from (1.45) that the result is just

$$H|\phi\rangle = |0\rangle. \tag{1.72}$$

So according to the Born rule, if we measure a Qbit described by the
state $H|\phi\rangle$, the result will be 0 with probability 1.

But suppose that a Qbit in the state $|\phi\rangle$ were indeed either in the
state $|0\rangle$ with probability 1/2 or in the state $|1\rangle$ with
probability 1/2. In either case, according to (1.45), the subsequent action of
H would produce a state - either $(1/\sqrt{2})(|0\rangle + |1\rangle)$ or
$(1/\sqrt{2})(|0\rangle - |1\rangle)$ - that under measurement yielded 0 or 1
with equal probability. This contradicts the fact just extracted directly from
(1.72) that the result of making a measurement on a Qbit in the state
$H|\phi\rangle$ is invariably 0.
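
The contradiction can be exhibited numerically. In the sketch below (the helper names `apply` and `prob_zero` and the variable names are assumptions introduced for illustration), the probability of reading 0 after a Hadamard gate is computed in two ways: once for the genuine superposition of equal amplitudes on the two computational-basis states, and once under the rejected hypothesis that the Qbit is "really" in one basis state or the other with probability 1/2 each.

```python
import math

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]   # Hadamard matrix in the computational basis


def apply(gate, state):
    """Apply a 2x2 matrix to a pair of amplitudes."""
    return [sum(gate[i][j] * state[j] for j in range(2)) for i in range(2)]


def prob_zero(state):
    """Born-rule probability that a 1-Qbit measurement gate reads 0."""
    return abs(state[0]) ** 2


# Genuine superposition: equal amplitudes on |0> and |1>.
phi = [s, s]
p_superposition = prob_zero(apply(H, phi))

# Rejected hypothesis: the Qbit is |0> or |1>, each with probability 1/2.
p_mixture = 0.5 * prob_zero(apply(H, [1, 0])) + 0.5 * prob_zero(apply(H, [0, 1]))

print(p_superposition)  # close to 1: the reading is invariably 0
print(p_mixture)        # close to 1/2: the two hypotheses disagree
```

The two numbers differ because the amplitudes of the superposition interfere under H, whereas the probabilities of the hypothetical mixture can only be averaged.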

So a Qbit in a quantum superposition of $|0\rangle$ and $|1\rangle$ cannot be
viewed as being either in the state $|0\rangle$ or in the state $|1\rangle$
with certain probabilities. Such a state represents something quite different.
Although the Qbit reveals only a 0 or a 1 when you query it with a
measurement gate, prior to the query its state is not in general either
$|0\rangle$ or $|1\rangle$, but a superposition of the form (1.60). Such a
superposition is as natural and irreducible a description of a Qbit as
$|0\rangle$ and $|1\rangle$ are. This point is expanded on in Appendix C.

If the states of n Qbits are restricted to computational-basis states 
then the process of measurement is just like the classical process of 
“learning the value” of x without altering the state. Thus a quantum 
computer can be made to simulate a reversible classical computer by 
allowing only computational-basis states as input, and using only unitary
gates that take computational-basis states into computational-basis
states. 

The Born rule, relating the amplitudes $\alpha_x$ in the expansion (1.64)
of a general n-Qbit state $|\Psi\rangle$ to the probabilities of measuring x,
is often stated in terms of inner products or projection operators. 11
The probability of a measurement giving the result x ($0 \le x < 2^n$)
is

$$p_\Psi(x) = |\alpha_x|^2 = |\langle x|\Psi\rangle|^2. \tag{1.73}$$

It can also be usefully expressed in terms of projection operators:

$$p_\Psi(x) = \langle x|\Psi\rangle\langle\Psi|x\rangle = \langle x|P_\Psi|x\rangle \tag{1.74}$$

or

$$p_\Psi(x) = \langle\Psi|x\rangle\langle x|\Psi\rangle = \langle\Psi|P_x|\Psi\rangle, \tag{1.75}$$

where $P_\Psi = |\Psi\rangle\langle\Psi|$ is the projection operator on the
state $|\Psi\rangle$, and $P_x = |x\rangle\langle x|$ is the projection
operator on the state $|x\rangle$.


1.9 The generalized Born rule 

There is a stronger version of the Born rule, which plays an important 
role in quantum computation, even though, surprisingly, it is rarely 
explicitly mentioned in most standard quantum-mechanics texts. We 
shall call it the generalized Born rule. This stronger form applies when 
one measures only a single one of n + 1 Qbits, by sending it through a 
standard 1-Qbit measurement gate. 

To formulate the generalized Born rule, note that any state of all 
n + 1 Qbits can be represented in the form 


$$|\Psi\rangle_{n+1} = \alpha_0|0\rangle|\Phi_0\rangle_n + \alpha_1|1\rangle|\Phi_1\rangle_n, \qquad |\alpha_0|^2 + |\alpha_1|^2 = 1, \tag{1.76}$$


11 The Dirac notation for inner products and projection operators is 
described in Appendix A. 
where $|\Phi_0\rangle_n$ and $|\Phi_1\rangle_n$ are normalized (but not
necessarily orthogonal). This follows directly from the general form,

$$|\Psi\rangle_{n+1} = \sum_{x=0}^{2^{n+1}-1} \gamma(x)|x\rangle_{n+1}, \qquad \sum_{x=0}^{2^{n+1}-1} |\gamma(x)|^2 = 1. \tag{1.77}$$

The states $|\Phi_0\rangle_n$ and $|\Phi_1\rangle_n$ are given by

$$|\Phi_0\rangle_n = (1/\alpha_0)\sum_{x=0}^{2^n-1} \gamma(x)|x\rangle_n, \qquad |\Phi_1\rangle_n = (1/\alpha_1)\sum_{x=0}^{2^n-1} \gamma(2^n + x)|x\rangle_n, \tag{1.78}$$

where

$$\alpha_0^2 = \sum_{x=0}^{2^n-1} |\gamma(x)|^2, \qquad \alpha_1^2 = \sum_{x=0}^{2^n-1} |\gamma(2^n + x)|^2. \tag{1.79}$$


Fig 1.6


The action of a 1-Qbit measurement gate on a single one of n + 1
Qbits, according to the generalized Born rule. The initial state (on the
left) is a general (n + 1)-Qbit state, expressed in the form
$|\Psi\rangle_{n+1} = \alpha_0|0\rangle|\Phi_0\rangle_n + \alpha_1|1\rangle|\Phi_1\rangle_n$.
Only the single Qbit on the left of this expression is subjected to a
measurement gate.



(The $\alpha_0$ and $\alpha_1$ in (1.78) and (1.79) are real numbers, but can
be multiplied by arbitrary phase factors if $|\Phi_0\rangle_n$ and
$|\Phi_1\rangle_n$ are multiplied by the inverse phase factors.)

The generalized Born rule asserts that if one measures only the single
Qbit whose state symbol is explicitly separated out from the others in
the (n + 1)-Qbit state (1.76), then the 1-Qbit measurement gate will
indicate x (0 or 1) with probability $|\alpha_x|^2$, after which the
(n + 1)-Qbit state can be taken to be the product state
$|x\rangle|\Phi_x\rangle_n$. (The rule holds for the measurement of any single
Qbit - there is nothing special about the Qbit whose state symbol appears on
the left in the (n + 1)-Qbit state symbol.) This action of a 1-Qbit
measurement gate on an (n + 1)-Qbit state is depicted schematically in
Figure 1.6.
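
A minimal sketch of the generalized Born rule follows, under stated assumptions: the list `gamma` holds the amplitudes of an (n + 1)-Qbit state indexed by the integer x, with the measured Qbit as the most significant bit (so entries 0 to 2^n - 1 have that Qbit in |0> and the remaining entries have it in |1>); the function names `split_leading_qbit` and `measure_leading_qbit`, the example state, and the seed are all hypothetical.

```python
import math
import random


def split_leading_qbit(gamma):
    """Decompose amplitudes gamma(x) of n+1 Qbits into the form
    alpha_0 |0>|Phi_0> + alpha_1 |1>|Phi_1>, with the leading Qbit taken
    to be the most significant bit of x (an assumed convention).
    """
    half = len(gamma) // 2
    alphas, phis = [], []
    for block in (gamma[:half], gamma[half:]):
        alpha = math.sqrt(sum(abs(g) ** 2 for g in block))
        alphas.append(alpha)
        # Phi_x is undefined (and never needed) when alpha_x vanishes.
        phis.append([g / alpha for g in block] if alpha > 0 else None)
    return alphas, phis


def measure_leading_qbit(gamma, rng):
    """Simulate a 1-Qbit measurement gate acting on the leading Qbit.

    Returns the reading x and the normalized state Phi_x of the
    n unmeasured Qbits.
    """
    alphas, phis = split_leading_qbit(gamma)
    x = rng.choices([0, 1], weights=[a * a for a in alphas])[0]
    return x, phis[x]


s = 1 / math.sqrt(2)
# Hypothetical entangled 2-Qbit state (|0>|0> + |1>|1>)/sqrt(2).
gamma = [s, 0, 0, s]

alphas, phis = split_leading_qbit(gamma)
x, phi = measure_leading_qbit(gamma, random.Random(3))

print(alphas)   # both close to 1/sqrt(2)
print(x, phi)   # the unmeasured Qbit ends up in the state |x>
```

For this entangled example the state assigned to the unmeasured Qbit depends on the reading, which is exactly what the ordinary Born rule alone does not tell us.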

If the Qbit on which the 1-Qbit gate acts is initially unentangled with 
the remaining n Qbits, then the action of the gate on the measured Qbit 
is just that specified by the ordinary Born rule, and the unmeasured 
Qbits play no role at all, remaining in their original state throughout the 
process. This is evident from the above statement of the generalized 
Born rule, specialized to the case in which the two states
$|\Phi_0\rangle_n$ and $|\Phi_1\rangle_n$ are identical. It is illustrated in
Figure 1.7.

If one applies the generalized Born rule n times to successive
1-Qbit measurements of each of n Qbits, initially in the general
n-Qbit state (1.69), one can show by a straightforward argument, given
in Appendix E, that the final state of the n Qbits is $|x\rangle_n$ with
probability $|\alpha_x|^2$, where x is the n-bit integer whose bits are given
by the readings
Fig 1.7


A simplification of Figure 1.6 when
$|\Phi_0\rangle = |\Phi_1\rangle = |\Phi\rangle$. In this case the initial
state on the left is just the product state
$|\Psi\rangle = |\psi\rangle|\Phi\rangle = (\alpha_0|0\rangle + \alpha_1|1\rangle)|\Phi\rangle$,
and the final state of the unmeasured Qbits continues to be $|\Phi\rangle$
regardless of the value of x indicated by the 1-Qbit measurement gate. The
unmeasured Qbits are unentangled with the measured Qbit and described by the
state $|\Phi\rangle$ throughout the process. The 1-Qbit measurement gate acts
on the measured Qbit exactly as it does in Figure 1.5 when no other Qbits are
present, and the generalized Born rule of Figure 1.6 reduces to the ordinary
Born rule.

on the n 1-Qbit measurement gates. This is nothing but the ordinary
Born rule, with the n 1-Qbit measurement gates playing the role of
the single n-Qbit measurement gate. There is thus, as remarked upon
above, only a single primitive piece of measurement hardware: the
1-Qbit measurement gate. The construction of an n-Qbit measurement
gate out of n 1-Qbit measurement gates is depicted in Figure 1.8.
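
The claim can be spot-checked numerically for a particular state; this is a consistency check, not a substitute for the argument of Appendix E. In the sketch below, the function name `sequential_probability`, the convention of measuring the most significant Qbit first, and the sample amplitudes are all assumptions made for illustration. The probability of obtaining the readings that spell out x from n successive 1-Qbit measurements is computed as a product of generalized-Born-rule probabilities and compared with the squared magnitude of the amplitude of x.

```python
import math


def sequential_probability(amplitudes, x, n):
    """Probability that n successive 1-Qbit measurements, most significant
    Qbit first (an assumed order), give the bits of the integer x.

    At each step the generalized Born rule supplies the probability of the
    required reading and the normalized state of the Qbits not yet measured.
    """
    state = list(amplitudes)
    prob = 1.0
    for k in range(n - 1, -1, -1):
        bit = (x >> k) & 1
        half = len(state) // 2
        block = state[half:] if bit else state[:half]
        p = sum(abs(a) ** 2 for a in block)
        if p == 0:
            return 0.0          # this reading can never occur
        prob *= p
        norm = math.sqrt(p)
        state = [a / norm for a in block]
    return prob


# Hypothetical 3-Qbit state, normalized from arbitrary complex numbers.
raw = [1, 2j, -3, 0, 1 + 1j, 2, 0, 1]
norm = math.sqrt(sum(abs(r) ** 2 for r in raw))
amps = [r / norm for r in raw]

for x in range(8):
    print(x, sequential_probability(amps, x, 3), abs(amps[x]) ** 2)
```

Each printed row should show two numbers that agree to within rounding error.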

An even more general version of the Born rule follows from the 
generalized Born rule itself. The general state of m + n Qbits can be 
written as 


$$|\Psi\rangle_{m+n} = \sum_{x=0}^{2^m-1} \alpha_x |x\rangle_m |\Phi_x\rangle_n, \tag{1.80}$$

where $\sum_x |\alpha_x|^2 = 1$ and the states $|\Phi_x\rangle_n$ are
normalized, but not necessarily orthogonal. By applying the generalized Born
rule m times to m Qbits in an (m + n)-Qbit state, one establishes the rule
that if just the m Qbits on the left of (1.80) are measured, then with
probability $|\alpha_x|^2$ the result will be x, and after the measurement the
state of all m + n Qbits will be the product state

$$|x\rangle_m |\Phi_x\rangle_n, \tag{1.81}$$

in which the m measured Qbits are in the state $|x\rangle_m$ and the n
unmeasured ones are in the state $|\Phi_x\rangle_n$.


1.10 Measurement gates and state preparation 

In addition to providing an output at the end of a computation,
measurement gates also play a crucial role (which is not often emphasized)
at the beginning. Since there is no way to determine the state of a given
collection of Qbits - indeed, in general such a collection might be
entangled with other Qbits and therefore not even have a state of its own - how
can one produce a set of Qbits in a definite state for the gates of a
quantum computer to transform into another computationally useful state?

The answer is by measurement. If one takes n Qbits off the shelf,
and subjects them to an n-Qbit measurement gate that registers x,
then the Qbits emerging from that gate are assigned the classical-basis
state $|x\rangle_n$. If one then applies the 1-Qbit operation X to each Qbit
that registered a 1 in the measurement, doing nothing to the Qbits that
Fig 1.8


Constructing a 4-Qbit measurement gate out of four 1-Qbit
measurement gates. The integer x has the binary expansion
$x_3 x_2 x_1 x_0$.


registered 0, the resulting set of Qbits will be described by the state
$|0\rangle_n$. It is this state that most quantum-computational algorithms
take as their input. Such a use of a measurement gate to produce a Qbit
described by the state $|0\rangle$ is shown in Figure 1.9.
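
The preparation procedure can be sketched as a simulation. Two assumptions should be kept in mind: the Qbits are given a definite (if arbitrary) input state so that the measurement can be simulated at all, which is more than one knows about genuinely off-the-shelf Qbits, and the helper names `measure_all`, `apply_x`, and `prepare_zero_state` are invented for this illustration.

```python
import random


def measure_all(amplitudes, rng):
    """Simulated n-Qbit measurement gate: returns the reading x and the
    classical-basis state assigned to the Qbits afterwards."""
    probs = [abs(a) ** 2 for a in amplitudes]
    x = rng.choices(range(len(amplitudes)), weights=probs)[0]
    post_state = [0j] * len(amplitudes)
    post_state[x] = 1 + 0j
    return x, post_state


def apply_x(state, k):
    """Apply the NOT gate X to Qbit k (bit k of the basis-state label),
    which exchanges the amplitudes of labels differing in that bit."""
    return [state[i ^ (1 << k)] for i in range(len(state))]


def prepare_zero_state(amplitudes, n, rng):
    """Measure all n Qbits, then apply X to each Qbit that registered 1."""
    x, state = measure_all(amplitudes, rng)
    for k in range(n):
        if (x >> k) & 1:
            state = apply_x(state, k)
    return state


rng = random.Random(2024)
unknown = [0.5, -0.5j, 0.5j, 0.5]   # stand-in for an unknown 2-Qbit state
prepared = prepare_zero_state(unknown, 2, rng)

print(prepared)   # all amplitude on the basis state labelled 0
```

Whatever reading the simulated gate produces, the conditional X gates return every Qbit to 0, so the output does not depend on the input amplitudes.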

Measurement gates therefore play two roles in a quantum
computation. They get the Qbits ready for the subsequent action of the
computer, and they extract from the Qbits a digital output after the
computer has acted. The initial action of the measurement gates is
called state preparation, since the Qbits emerging from the process can
be characterized by a definite state. The association of unitary
operators with the gates that subsequently act on the Qbits permits one
to update that initial state assignment into the corresponding unitary
transformation of the initial state, thereby making it possible to
calculate, using the Born rule, the probabilities of the outcomes of the final
measurement gates.

This role of measurement gates in state preparation follows from 
the Born rule if the Qbits that are to be prepared already have a state of 
their own, even though that state might not be known to the user of the 
quantum computer. It also follows from the generalized Born rule if the 
Qbits already share an entangled state - again, not necessarily known to 
the user - with additional (unmeasured) Qbits. But one cannot deduce 
from the Born rules that measurement gates serve to prepare states 
for Qbits “off the shelf,” whose past history nobody knows anything 
about. In such cases the use of measurement gates to assign a state to 
the Qbits is a reasonable and plausible extension of the Born rules. It 
is consistent with them, but goes beyond them. 

For particular physical realizations of Qbits, there may be other ways
to produce the standard initial state $|0\rangle_n$. Suppose, for example,
that each Qbit is an atom, the state $|0\rangle$ is the lowest-energy state
(the ground state) of the atom, and the state $|1\rangle$ is the atomic state
of next-lowest
Fig 1.9


Using a 1-Qbit measurement gate to prepare an off-the-shelf
Qbit so that its associated state is $|0\rangle$. The input on the left is a
Qbit in an unknown condition - i.e. nothing is known of its past history.
After the measurement gate is applied, the NOT gate X is or is not applied,
depending on whether the measurement gate indicates 1 or 0. The Qbit that
emerges (on the right) is described by the state $|0\rangle$.


energy (the first excited state). Then one can produce the state |0)„ by 
cooling n such atoms to an appropriately low temperature (determined 
by the energy difference between the two states - the smaller that 
energy, the lower the temperature must be). 

From the conceptual point of view, state preparation by the use of
measurement gates is the simplest way. An acceptable physical
candidate for a Qbit must be a system for which measurement gates are
readily available. Otherwise there would be no way of extracting
information from the computation, however well the unitary gates did
their job. So the hardware for state preparation by measurement is
already there. Whether one chooses to use it or other (e.g. cryogenic)
methods to initialize the Qbits to the state $|0\rangle_n$ is a practical
matter that need not concern us here. It is enough to know that it can
always be done with measurement gates.
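The measure-then-correct procedure of Fig 1.9 can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
it models the case in which the Qbit already has a state of its own
(unknown to the user), so that the Born rule applies, and the function
names `measure` and `prepare_zero` are hypothetical.

```python
import random

def measure(amp0, amp1, rng):
    """Born rule for one Qbit: return 1 with probability |amp1|^2, else 0."""
    p1 = abs(amp1) ** 2 / (abs(amp0) ** 2 + abs(amp1) ** 2)
    return 1 if rng.random() < p1 else 0

def prepare_zero(amp0, amp1, rng):
    """Measure the Qbit, then apply NOT only if the gate indicated 1."""
    outcome = measure(amp0, amp1, rng)
    # After the measurement the Qbit is described by |outcome>.
    state = (1.0, 0.0) if outcome == 0 else (0.0, 1.0)
    if outcome == 1:
        state = (state[1], state[0])  # X interchanges |0> and |1>
    return outcome, state

rng = random.Random(1)  # seeded only to make the sketch repeatable
for amp0, amp1 in [(1, 0), (0, 1), (0.6, 0.8j), (2 ** -0.5, -(2 ** -0.5))]:
    outcome, state = prepare_zero(amp0, amp1, rng)
    print(outcome, state)  # the state is (1.0, 0.0) whatever the outcome
```

Whatever the measurement gate reports, the emerging Qbit is described
by $|0\rangle$; only the intermediate outcome is random.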


1.11 Constructing arbitrary 1- and 2-Qbit states

The art of quantum computation is to construct circuits out of 1- 
and 2-Qbit gates that produce final states capable of revealing useful 
information, when measured. The expectation is that 1-Qbit gates 
will be comparatively easy to construct. Two-Qbit gates that are not 
mere tensor products of 1-Qbit gates are likely to be substantially more 
difficult to make. Attention has focused strongly on the cNOT gate, 
and gates that can be constructed from it in combination with 1-Qbit 
unitaries. All of the circuits we shall be examining can, in fact, be 
reduced to combinations of 1-Qbit gates and 2-Qbit cNOT gates. Given 
the difficulty in making cNOT gates, it is generally considered desirable 
to keep their number as small as possible. As an illustration of such 
constructions, we now examine how to assign arbitrary states to one or
two Qbits, starting with the standard 1-Qbit state $|0\rangle$ or the standard
2-Qbit state $|00\rangle$. (Both of these standard states can be produced with
the help of measurement gates, as described in Section 1.10.)

The situation for 1-Qbit states is quite simple. Let $|\psi\rangle$ be any
1-Qbit state, and let $|\phi\rangle$ be the orthogonal state (unique to within
an overall phase), satisfying $\langle\phi|\psi\rangle = 0$. Since $|0\rangle$ and $|1\rangle$ are
linearly independent, there is a unique linear transformation taking
them into $|\psi\rangle$ and $|\phi\rangle$. But, since $|\psi\rangle$ and $|\phi\rangle$ are an orthonormal
pair (as are $|0\rangle$ and $|1\rangle$), this linear transformation is easily
verified to preserve the norm of arbitrary states, so it is a unitary
transformation u. Thus, for any $|\psi\rangle$ there is a 1-Qbit unitary gate u
that takes $|0\rangle$ into $|\psi\rangle$:

$$|\psi\rangle = \mathbf{u}|0\rangle. \tag{1.82}$$

Things are more complicated for 2-Qbit states. An unentangled 2-Qbit
state, being the product of two 1-Qbit states, can be constructed out
of $|00\rangle$ by the application of 1-Qbit unitaries to each of the two
Qbits. But a general 2-Qbit state is entangled, and its production
requires a 2-Qbit gate that is not just a tensor product of 1-Qbit
unitaries. Interestingly, a single cNOT gate, combined with 1-Qbit
unitaries, is enough to do the trick.

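To see this, note that the general 2-Qbit state,

$$|\Psi\rangle = \alpha_{00}|00\rangle + \alpha_{01}|01\rangle + \alpha_{10}|10\rangle + \alpha_{11}|11\rangle, \tag{1.83}$$

is of the form

$$|\Psi\rangle = |0\rangle \otimes |\psi\rangle + |1\rangle \otimes |\phi\rangle, \tag{1.84}$$

where $|\psi\rangle = \alpha_{00}|0\rangle + \alpha_{01}|1\rangle$ and $|\phi\rangle = \alpha_{10}|0\rangle + \alpha_{11}|1\rangle$. Apply
$\mathbf{u} \otimes \mathbf{1}$ to $|\Psi\rangle$, where u is a linear transformation whose action on
the computational basis is of the form

$$\mathbf{u}|0\rangle = a|0\rangle + b|1\rangle, \quad \mathbf{u}|1\rangle = -b^*|0\rangle + a^*|1\rangle; \quad |a|^2 + |b|^2 = 1. \tag{1.85}$$

The transformation u is unitary because it preserves the orthogonality
and normalization of the basis $|0\rangle$, $|1\rangle$.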

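We have

$$\begin{aligned}
(\mathbf{u} \otimes \mathbf{1})|\Psi\rangle &= (a|0\rangle + b|1\rangle) \otimes |\psi\rangle + (-b^*|0\rangle + a^*|1\rangle) \otimes |\phi\rangle \\
&= |0\rangle \otimes |\psi'\rangle + |1\rangle \otimes |\phi'\rangle,
\end{aligned} \tag{1.86}$$

where

$$|\psi'\rangle = a|\psi\rangle - b^*|\phi\rangle, \quad |\phi'\rangle = b|\psi\rangle + a^*|\phi\rangle. \tag{1.87}$$

We would like to choose the complex numbers a and b to make $|\psi'\rangle$
and $|\phi'\rangle$ orthogonal. The inner product $\langle\phi'|\psi'\rangle$ is

$$\langle\phi'|\psi'\rangle = a^2\langle\phi|\psi\rangle - b^{*2}\langle\psi|\phi\rangle + ab^*\bigl(\langle\psi|\psi\rangle - \langle\phi|\phi\rangle\bigr). \tag{1.88}$$

If $\langle\phi|\psi\rangle \neq 0$, then setting $\langle\phi'|\psi'\rangle$ to 0 gives a quadratic equation
for $a/b^*$, which has two complex solutions. Either solution fixes the
ratio $a/b^*$, and the normalization condition in (1.85) then fixes the
magnitudes of a and b. The resulting a and b give a 1-Qbit unitary u
for which

$$(\mathbf{u} \otimes \mathbf{1})|\Psi\rangle = |0\rangle \otimes |\psi'\rangle + |1\rangle \otimes |\phi'\rangle \tag{1.89}$$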
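where $|\psi'\rangle$ and $|\phi'\rangle$ are orthogonal. If $\langle\phi|\psi\rangle = 0$ then (1.84) is
already of this form with u = 1.

We can pick positive real numbers $\lambda$ and $\mu$ so that $|\psi''\rangle = |\psi'\rangle/\lambda$ and
$|\phi''\rangle = |\phi'\rangle/\mu$ are unit vectors, making $|\psi''\rangle$ and $|\phi''\rangle$ an orthonormal
pair. They are therefore related to $|0\rangle$ and $|1\rangle$ by a unitary
transformation v:

$$|\psi''\rangle = \mathbf{v}|0\rangle, \quad |\phi''\rangle = \mathbf{v}|1\rangle. \tag{1.90}$$

Equation (1.89) then gives (see footnote 12)

$$|\Psi\rangle = (\mathbf{u}^\dagger \otimes \mathbf{v})\bigl(\lambda|0\rangle \otimes |0\rangle + \mu|1\rangle \otimes |1\rangle\bigr). \tag{1.91}$$

We can write this as

$$|\Psi\rangle = (\mathbf{u}^\dagger \otimes \mathbf{v})\,\mathbf{C}_{10}\bigl(\lambda|0\rangle + \mu|1\rangle\bigr) \otimes |0\rangle. \tag{1.92}$$

Since $|\Psi\rangle$ is a unit vector and unitary transformations preserve unit
vectors, it follows from (1.91) that $\lambda|0\rangle + \mu|1\rangle$ is a unit vector. It
can therefore be obtained from $|0\rangle$ by a unitary transformation w. So

$$|\Psi\rangle = (\mathbf{u}^\dagger \otimes \mathbf{v})\,\mathbf{C}_{10}(\mathbf{w} \otimes \mathbf{1})\bigl(|0\rangle \otimes |0\rangle\bigr) = \mathbf{u}_1^\dagger \mathbf{v}_0 \mathbf{C}_{10} \mathbf{w}_1 |00\rangle. \tag{1.93}$$

We have thus established that a general 2-Qbit state $|\Psi\rangle$ can be
constructed out of three 1-Qbit unitaries and a single cNOT gate, acting
on the standard state $|00\rangle$. This is an early example of the usefulness
of cNOT gates.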
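The choice of a and b that makes the two 1-Qbit vectors in (1.87)
orthogonal can be checked numerically. The sketch below is an
illustration only: the amplitudes in `raw` are hypothetical, b is taken
to be real (one of many allowed phase choices), and the helper names
are not from the text.

```python
import cmath
import math

def inner(u, v):
    """Inner product <u|v> of two 1-Qbit vectors (conjugate-linear in u)."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def orthogonalizing_pair(psi, phi):
    """Return (a, b) obeying (1.85) for which <phi'|psi'> in (1.88) vanishes."""
    A = inner(phi, psi)                      # coefficient of a^2
    if abs(A) < 1e-15:
        return 1.0 + 0j, 0j                  # already orthogonal: u = 1
    B = inner(psi, psi) - inner(phi, phi)    # coefficient of a b*
    C = -inner(psi, phi)                     # coefficient of b*^2
    r = (-B + cmath.sqrt(B * B - 4 * A * C)) / (2 * A)   # r = a / b*
    b = 1 / math.sqrt(1 + abs(r) ** 2)       # real b, so b* = b
    return r * b, complex(b)

def rotate(psi, phi, a, b):
    """Apply (1.87): psi' = a psi - b* phi, phi' = b psi + a* phi."""
    bc, ac = b.conjugate(), a.conjugate()
    psi_p = [a * p - bc * q for p, q in zip(psi, phi)]
    phi_p = [b * p + ac * q for p, q in zip(psi, phi)]
    return psi_p, phi_p

raw = [1, 2j, 1 + 1j, 0.5]                   # hypothetical amplitudes
norm = math.sqrt(sum(abs(z) ** 2 for z in raw))
alpha = [z / norm for z in raw]
psi, phi = alpha[:2], alpha[2:]
a, b = orthogonalizing_pair(psi, phi)
psi_p, phi_p = rotate(psi, phi, a, b)
print(abs(inner(phi_p, psi_p)))              # numerically zero
```

The norms of `psi_p` and `phi_p` are then the positive numbers called
$\lambda$ and $\mu$ in the text.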


1.12 Summary: Qbits versus Cbits 

Table 1.1 gives a concise comparison of the elementary properties of
Cbits and Qbits. The table uses the term “Bit,” with an upper-case
B, to mean “Qbit or Cbit,” which should be distinguished from “bit,”
with a lower-case b, which means “0 or 1.” Alice (in the fifth line of
the table) is anybody who knows the relevant history of the Qbits -
their initial state preparation and the unitary gates that have
subsequently acted on them.


12 This form for a general vector in a space of 2 × 2 dimensions is a special
case of a more general result for d × d dimensions known as the polar (or
Schmidt) decomposition theorem.

Table 1.1. A summary of the features of Qbits, contrasted to the analogous features of Cbits

                                   Cbits                          Qbits
States of n Bits                   $|x\rangle_n$, $0 \le x < 2^n$     $\sum_x \alpha_x |x\rangle_n$, $\sum_x |\alpha_x|^2 = 1$
Subsets of n Bits                  Always have states             Generally have no states
Reversible operations on states    Permutations                   Unitary transformations
Can state be learned from Bits?    Yes                            No
To learn state of Bits             Examine them                   Go ask Alice
To get information from Bits       Just look at them              Measure them
Information acquired               x                              x with probability $|\alpha_x|^2$
State after information acquired   Same: still $|x\rangle$            Different: now $|x\rangle$














Chapter 2 


General features and some 
simple examples 

2.1 The general computational process 

A suitably programmed quantum computer should act on a number x
to produce another number f(x) for some specified function f.
Appropriately interpreted, with an accuracy that increases with
increasing k, we can treat such numbers as non-negative integers less
than $2^k$. Each integer is represented in the quantum computer by the
corresponding computational-basis state of k Qbits.

If we specify the numbers x as n-bit integers and the numbers f(x)
as m-bit integers, then we shall need at least n + m Qbits: a set of
n Qbits, called the input register, to represent x, and another set of
m Qbits, called the output register, to represent f(x). Qbits being a
scarce commodity, you might wonder why we need separate registers for
input and output. One important reason is that if f(x) assigns the same
value to different values of x, as many interesting functions do, then
the computation cannot be inverted if its only effect is to transform
the contents of a single register from x to f(x). Having separate
registers for input and output is standard practice in the classical
theory of reversible computation. Since quantum computers must operate
reversibly to perform their magic (except for measurement gates), they
are generally designed to operate with both input and output registers.
We shall find that this dual-register architecture can also be usefully
exploited by a quantum computer in some strikingly nonclassical ways.

The computational process will generally require many Qbits besides
the n + m in the input and output registers, but we shall ignore
these additional Qbits for now, viewing a computation of f as doing
nothing more than applying a unitary transformation, $\mathbf{U}_f$, to the
n + m Qbits of the input and output registers. We take up the
fundamental question of why the additional Qbits can be ignored in
Section 2.3, only noting for now that it is the reversibility of the
computation that makes this possible.

We define the transformation $\mathbf{U}_f$ by specifying it as a reversible
transformation taking computational-basis states into
computational-basis states. As noted in Section 1.6, the linear
extension of such a classically meaningful transformation to arbitrary
complex superpositions of computational-basis states is necessarily
unitary. The standard quantum-computational protocol, which we shall
use repeatedly, defines the action of $\mathbf{U}_f$ on the computational-basis
states $|x\rangle_n|y\rangle_m$ of the n + m Qbits making up the input and output
registers as follows:

$$\mathbf{U}_f\bigl(|x\rangle_n|y\rangle_m\bigr) = |x\rangle_n|y \oplus f(x)\rangle_m, \tag{2.1}$$

where $\oplus$ indicates modulo-2 bitwise addition (without carrying) or,
if you prefer, the bitwise exclusive OR. If x and y are m-bit integers
whose jth bits are $x_j$ and $y_j$, then $x \oplus y$ is the m-bit integer whose
jth bit is $x_j \oplus y_j$. Thus $1101 \oplus 0111 = 1010$. This is a straightforward
generalization of the single-bit $\oplus$ defined in Section 1.3.
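For readers who think in code, the bitwise operation just defined is
what Python's `^` operator does on integers. The helper name `xor_bits`
is only illustrative.

```python
def xor_bits(x: int, y: int) -> int:
    """Modulo-2 bitwise addition without carrying: bit j of the result is x_j XOR y_j."""
    return x ^ y

result = xor_bits(0b1101, 0b0111)
print(format(result, "04b"))  # prints 1010
```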

If the initial value represented by the output register is y = 0 then
we have

$$\mathbf{U}_f\bigl(|x\rangle_n|0\rangle_m\bigr) = |x\rangle_n|f(x)\rangle_m \tag{2.2}$$

and we do indeed end up with f(x) in the output register. Regardless
of the initial value of y, the input register remains in its initial
state $|x\rangle_n$.

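The transformation (2.1) is clearly invertible. Indeed, $\mathbf{U}_f$ is its own
inverse:

$$\begin{aligned}
\mathbf{U}_f\mathbf{U}_f\bigl(|x\rangle|y\rangle\bigr) &= \mathbf{U}_f\bigl(|x\rangle|y \oplus f(x)\rangle\bigr) \\
&= |x\rangle|y \oplus f(x) \oplus f(x)\rangle = |x\rangle|y\rangle,
\end{aligned} \tag{2.3}$$

since $z \oplus z = 0$ for any z. (From this point on I shall use subscripts
that specify the numbers of Qbits only when it is important to
emphasize what those numbers are.)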
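A small check of these two properties on basis-state labels: the sketch
below treats $\mathbf{U}_f$ as the map (x, y) -> (x, y XOR f(x)) and confirms
that it permutes the basis states and undoes itself, even for a
function that is not one-to-one. The function `f` and the register
sizes are hypothetical choices for illustration.

```python
def make_U_f(f):
    """Return U_f acting on basis labels: (x, y) -> (x, y XOR f(x)), as in (2.1)."""
    def U_f(x, y):
        return x, y ^ f(x)
    return U_f

n, m = 3, 2                      # hypothetical register sizes

def f(x):
    return (x * x) % 4           # hypothetical f; not one-to-one

U_f = make_U_f(f)
basis = [(x, y) for x in range(2 ** n) for y in range(2 ** m)]
images = [U_f(x, y) for x, y in basis]

is_permutation = sorted(images) == sorted(basis)
is_self_inverse = all(U_f(*U_f(x, y)) == (x, y) for x, y in basis)
print(is_permutation, is_self_inverse)  # prints True True
```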

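The form (2.2) inspires the most important trick of the
quantum-computational repertoire. If we apply to each Qbit in the
2-Qbit state $|0\rangle|0\rangle$ the 1-Qbit Hadamard transformation H (Equation
(1.45)), then we get

$$\begin{aligned}
(\mathbf{H} \otimes \mathbf{H})(|0\rangle \otimes |0\rangle) &= \mathbf{H}_1\mathbf{H}_0|0\rangle|0\rangle = (\mathbf{H}|0\rangle)(\mathbf{H}|0\rangle) \\
&= \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle)\,\tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \\
&= \tfrac{1}{2}\bigl(|0\rangle|0\rangle + |0\rangle|1\rangle + |1\rangle|0\rangle + |1\rangle|1\rangle\bigr) \\
&= \tfrac{1}{2}\bigl(|0\rangle_2 + |1\rangle_2 + |2\rangle_2 + |3\rangle_2\bigr).
\end{aligned} \tag{2.4}$$

This clearly generalizes to the n-fold tensor product of n Hadamards,
applied to the n-Qbit state $|0\rangle_n$:

$$\mathbf{H}^{\otimes n}|0\rangle_n = \frac{1}{2^{n/2}} \sum_{0 \le x < 2^n} |x\rangle_n, \tag{2.5}$$

where

$$\mathbf{H}^{\otimes n} = \mathbf{H} \otimes \mathbf{H} \otimes \cdots \otimes \mathbf{H}, \quad n \text{ times}. \tag{2.6}$$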
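Equation (2.5) is easy to confirm numerically for small n. The sketch
below stores an n-Qbit state as a list of $2^n$ amplitudes indexed by x
and applies a Hadamard to each Qbit in turn. The helper names and the
convention that Qbit 0 is the rightmost bit of the index are
assumptions of this illustration.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]  # 1-Qbit Hadamard in the basis |0>, |1>

def apply_gate(state, gate, k):
    """Apply a 2x2 gate to Qbit k (Qbit 0 is the rightmost bit of the index)."""
    new = [0.0] * len(state)
    for idx, amp in enumerate(state):
        bit = (idx >> k) & 1
        cleared = idx & ~(1 << k)
        for out in (0, 1):
            new[cleared | (out << k)] += gate[out][bit] * amp
    return new

def hadamard_all(n):
    """Return the amplitudes of the n-fold Hadamard applied to |0>_n."""
    state = [1.0] + [0.0] * (2 ** n - 1)
    for k in range(n):
        state = apply_gate(state, H, k)
    return state

n = 3
state = hadamard_all(n)
print(all(math.isclose(amp, 2 ** (-n / 2)) for amp in state))  # prints True
```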


So if the initial state of the input register is $|0\rangle_n$ and we apply an
n-fold Hadamard transformation to that register, its state becomes an
equally weighted superposition of all possible n-Qbit inputs. If we
then apply $\mathbf{U}_f$ to that superposition, with 0 initially in the output
register, then by linearity we get from (2.5) and (2.2)

$$\begin{aligned}
\mathbf{U}_f(\mathbf{H}^{\otimes n} \otimes \mathbf{1}_m)\bigl(|0\rangle_n|0\rangle_m\bigr) &= \frac{1}{2^{n/2}} \sum_{0 \le x < 2^n} \mathbf{U}_f\bigl(|x\rangle_n|0\rangle_m\bigr) \\
&= \frac{1}{2^{n/2}} \sum_{0 \le x < 2^n} |x\rangle_n|f(x)\rangle_m.
\end{aligned} \tag{2.7}$$


This contains an important part of the magic that underlies quantum
computation. If before letting $\mathbf{U}_f$ act, we merely apply a Hadamard
transformation to every Qbit of the input register, initially in the
standard state $|0\rangle_n$, the result of the computation is described by a
state whose structure cannot be explicitly specified without knowing
the result of all $2^n$ evaluations of the function f. So if we have a
mere hundred Qbits in the input register, initially all in the state
$|0\rangle_{100}$ (and m more in the output register), if a hundred Hadamard
gates act on the input register before the application of $\mathbf{U}_f$, then
the form of the final state contains the results of $2^{100} \approx 10^{30}$
evaluations of the function f. A billion billion trillion evaluations!
This apparent miracle is called quantum parallelism.

But a major part of the miracle is only apparent. One cannot say
that the result of the calculation is $2^n$ evaluations of f, though some
practitioners of quantum computation are rather careless about making
such a claim. All one can say is that those evaluations characterize the
form of the state that describes the output of the computation. One
knows what the state is only if one already knows the numerical values
of all those $2^n$ evaluations of f. Before drawing extravagant practical,
or even only metaphysical, conclusions from quantum parallelism, it
is essential to remember that when you have a collection of Qbits in a
definite but unknown state, there is no way to find out what that state is.

If there were a way to learn the state of such a set of Qbits, then
everyone could join in the rhapsodic chorus. (Typical verses: “Where
were all those calculations done? In parallel universes!” “The
possibility of quantum computation has established the existence of the
multiverse.” “Quantum computation achieves its power by dividing the
computational task among huge numbers of parallel worlds.”) But there
is no way to learn the state. The only way to extract any information
from Qbits is to subject them to a measurement.

When we send all n + m Qbits through measurement gates, the
Born rule tells us that if the state of the registers has the form (2.7),
then with equal probability the result of measuring the Qbits in the
input register will be any one of the values of x less than $2^n$, while
the result of measuring the Qbits in the output register will be the
value of f for that particular value of x. So by measuring the Qbits
we can learn a single value of f as well as learning a single (random)
$x_0$ at which f has that value. After the measurement the state of the
registers reduces to $|x_0\rangle|f(x_0)\rangle$ and we are no longer able to learn
anything about the values of f for any other values of x. So although
we can learn something from the output of the “parallel computation,”
it is nothing more than what we would have learned had we simply run
the computation starting with a classical state $|x\rangle$ in the input
register, with the value of x chosen randomly. That, of course, could
have been done with an ordinary classical computer.
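What a measurement of the state (2.7) yields can be mimicked with a
short simulation. The sketch stores the state as a dictionary from
basis labels (x, f(x)) to amplitudes and samples one outcome according
to the Born rule. The function `f`, the register size, and the helper
names are hypothetical.

```python
import random

def parallel_state(f, n):
    """Amplitudes of (2.7): each pair (x, f(x)) carries amplitude 2^(-n/2)."""
    amp = 2 ** (-n / 2)
    return {(x, f(x)): amp for x in range(2 ** n)}

def measure_all(state, rng):
    """Born rule: pick one basis label with probability |amplitude|^2.

    Returns the outcome and the post-measurement state, which retains
    no trace of the other values of f.
    """
    outcomes = list(state)
    weights = [abs(state[o]) ** 2 for o in outcomes]
    result = rng.choices(outcomes, weights=weights, k=1)[0]
    return result, {result: 1.0}

def f(x):
    return (x * x) % 4           # hypothetical f

rng = random.Random(0)           # seeded only for repeatability
state = parallel_state(f, 3)
(x0, fx0), collapsed = measure_all(state, rng)
print(x0, fx0, collapsed)        # one random x0 together with f(x0)
```

A single run reveals exactly one pair $(x_0, f(x_0))$, which is what a
classical evaluation at a randomly chosen x would have produced.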

To be sure, a hint of a miracle remains - hardly more than the smile 
of the Cheshire cat - in the fact that in the quantum case the random 
selection of the x, for which f(x) can be learned, is made only after 
the computation has been carried out. (To assert that the selection 
was made before the computation was done is to make the same error 
as asserting that a Qbit described by a superposition of the states $|0\rangle$
and $|1\rangle$ is actually in one or the other of them, as discussed in Section
1.8.) This is a characteristic instance of what journalists like to call
“quantum weirdness,” in that (a) it is indeed vexing to contemplate 
the fact that the choice of the value of x for which f can be learned 
is made only after - quite possibly long after - the computation has 
been finished, but (b) since that choice is inherently random - beyond 
anyone’s power to control in any way whatever - it does not matter for 
any practical purpose whether the selection was made astonishingly 
after or boringly before the calculation was executed. 

If, of course, there were an easy way to make copies of the output
state prior to making the measurement, without running the whole
computation over again, then one could, with high probability, learn
the values of f for several different (random) values of x. But such
copying is prohibited by an elementary result called the “no-cloning
theorem,” which states that there is no such duplication procedure:
there is no unitary transformation that can take the state $|\psi\rangle_n|0\rangle_n$
into the state $|\psi\rangle_n|\psi\rangle_n$ for arbitrary $|\psi\rangle_n$.

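The no-cloning theorem is an immediate consequence of linearity.
If

$$\mathbf{U}\bigl(|\psi\rangle|0\rangle\bigr) = |\psi\rangle|\psi\rangle \quad \text{and} \quad \mathbf{U}\bigl(|\phi\rangle|0\rangle\bigr) = |\phi\rangle|\phi\rangle, \tag{2.8}$$

then it follows from linearity that

$$\mathbf{U}\bigl(a|\psi\rangle + b|\phi\rangle\bigr)|0\rangle = a\mathbf{U}|\psi\rangle|0\rangle + b\mathbf{U}|\phi\rangle|0\rangle = a|\psi\rangle|\psi\rangle + b|\phi\rangle|\phi\rangle. \tag{2.9}$$

But if U cloned arbitrary inputs, we would have

$$\begin{aligned}
\mathbf{U}\bigl(a|\psi\rangle + b|\phi\rangle\bigr)|0\rangle &= \bigl(a|\psi\rangle + b|\phi\rangle\bigr)\bigl(a|\psi\rangle + b|\phi\rangle\bigr) \\
&= a^2|\psi\rangle|\psi\rangle + b^2|\phi\rangle|\phi\rangle + ab|\psi\rangle|\phi\rangle + ab|\phi\rangle|\psi\rangle,
\end{aligned} \tag{2.10}$$

which differs from (2.9) unless one of a and b is zero. Surprisingly,
this very simple theorem was not proved until half a century after the
discovery of quantum mechanics, presumably because it took that long
for it to occur to somebody that it was an interesting proposition to
formulate.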
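A concrete instance of this argument, offered as an illustration rather
than as part of the proof: the cNOT gate does copy the two
computational-basis states $|0\rangle$ and $|1\rangle$ into a second Qbit that starts in
$|0\rangle$, but by linearity it then fails on their superposition. Using cNOT
as the candidate "cloner" is a choice made for this sketch.

```python
import math

def tensor(u, v):
    """Amplitudes of |u>|v>, ordered as |00>, |01>, |10>, |11>."""
    return [ui * vj for ui in u for vj in v]

def cnot(state):
    """cNOT with the left Qbit as control: interchanges |10> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]

zero, one = [1.0, 0.0], [0.0, 1.0]
a = b = 1 / math.sqrt(2)
psi = [a, b]                                   # a|0> + b|1>

clones_zero = cnot(tensor(zero, zero)) == tensor(zero, zero)
clones_one = cnot(tensor(one, zero)) == tensor(one, one)
linear_result = cnot(tensor(psi, zero))        # a|00> + b|11>, as in (2.9)
true_clone = tensor(psi, psi)                  # all four terms, as in (2.10)

print(clones_zero, clones_one)                 # prints True True
print(linear_result)
print(true_clone)
```

The last two lists differ: the linear result has no $|01\rangle$ or $|10\rangle$
component, whereas a true copy of the superposition would.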

Of course, the ability to clone to a reasonable degree of
approximation would be quite useful. But this is also impossible.
Suppose that U approximately cloned both $|\psi\rangle$ and $|\phi\rangle$:

$$\mathbf{U}\bigl(|\psi\rangle|0\rangle\bigr) \approx |\psi\rangle|\psi\rangle \quad \text{and} \quad \mathbf{U}\bigl(|\phi\rangle|0\rangle\bigr) \approx |\phi\rangle|\phi\rangle. \tag{2.11}$$

Then since unitary transformations preserve inner products, since the
inner product of a tensor product of states is the (ordinary) product of
their inner products, and since $\langle 0|0\rangle = 1$, it follows from (2.11) that

$$\langle\phi|\psi\rangle \approx \langle\phi|\psi\rangle^2. \tag{2.12}$$

But this requires $\langle\phi|\psi\rangle$ to be either close to 1 or close to 0. Hence a
unitary transformation can come close to cloning both of two states
$|\psi\rangle$ and $|\phi\rangle$ only if the states are very nearly the same, or very close
to being orthogonal. In all other cases at least one of the two states
will be badly copied.
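The step from (2.11) to (2.12) can be spelled out as follows. This is
a sketch that uses only the three facts cited above: unitarity,
factorization of inner products over tensor products, and
$\langle 0|0\rangle = 1$.

```latex
\begin{aligned}
\langle\phi|\psi\rangle
  &= \langle\phi|\psi\rangle\,\langle 0|0\rangle
   = \bigl(\langle\phi|\langle 0|\bigr)\bigl(|\psi\rangle|0\rangle\bigr) \\
  &= \bigl(\langle\phi|\langle 0|\bigr)\,\mathbf{U}^\dagger\mathbf{U}\,\bigl(|\psi\rangle|0\rangle\bigr)
   \approx \bigl(\langle\phi|\langle\phi|\bigr)\bigl(|\psi\rangle|\psi\rangle\bigr)
   = \langle\phi|\psi\rangle^{2}.
\end{aligned}
```

Writing $s = \langle\phi|\psi\rangle$, the condition $s \approx s^2$ says $s(1 - s) \approx 0$, which
holds only when s is near 0 or near 1.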

If this were the full story, nobody but a few philosophers would be 
interested in quantum computation. The National Security Agency 
of the United States of America is interested because there are more 
clever things one can do. Typically these involve applying additional
unitary gates to one or both of the input and output registers before
and/or after applying $\mathbf{U}_f$, sometimes intermingled with intermediate
measurement gates acting on subsets of the Qbits. All these additional
gates are cunningly chosen so that when one finally does measure all
the Qbits, one extracts useful information about relations between the
values of f for several different values of x, which a classical computer
could get only by making several independent evaluations. The price
one inevitably pays for this relational information is the loss of the 
possibility of learning the actual value f(x) for any individual x. This 
tradeoff of one kind of information for another is typical of quantum 
computation, and typical of quantum physics in general, where it is 
called the uncertainty principle. The principle was first enunciated by 
Werner Heisenberg in the context of mechanical information - the 
position of a particle versus its momentum. 

So it is wrong and deeply misleading to say that in the process that
assigns the state (2.7) to the Qbits, the quantum computer has evaluated
the function f(x) for all x in the range $0 \le x < 2^n$. Such assertions are
based on the mistaken view that the quantum state encodes a property
inherent in the Qbits. The state encodes only the possibilities available
for the extraction of information from those Qbits. You should keep
this in mind as we examine some of the specific ways in which this
nevertheless permits a quantum computer to perform tricks that no
classical computer can accomplish.


2.2 Deutsch's problem 

Deutsch’s problem is the simplest example of a quantum tradeoff that 
sacrifices particular information to acquire relational information. A 
crude version of it appeared in a 1985 paper by David Deutsch that, 
together with a 1982 paper by Richard Feynman, launched the whole 
field. In that early version the trick could be executed successfully only 
half the time. It took a while for people to realize that the trick could 
be accomplished every single time. Here is how it works. 

Let both input and output registers each contain only one Qbit, so 
we are exploring functions f that take a single bit into a single bit. 
There are two rather different ways to think about such functions. 

(1) The first way is to note that there are just four such functions, as
shown in Table 2.1. Suppose that we are given a black box that
calculates one of these four functions in the usual
quantum-computational format, by performing the unitary transformation

$$\mathbf{U}_f\bigl(|x\rangle|y\rangle\bigr) = |x\rangle|y \oplus f(x)\rangle, \tag{2.13}$$

where the state on the left is that of the 1-Qbit input register (i), and
the state on the right is that of the 1-Qbit output register (o). Using
the forms in Table 2.1 and the explicit structure (2.13) of $\mathbf{U}_f$, you can
easily confirm that

$$\mathbf{U}_{f_0} = \mathbf{1}, \quad \mathbf{U}_{f_1} = \mathbf{C}_{io}, \quad \mathbf{U}_{f_2} = \mathbf{C}_{io}\mathbf{X}_o, \quad \mathbf{U}_{f_3} = \mathbf{X}_o, \tag{2.14}$$

where 1 is the (2-Qbit) unit operator, $\mathbf{C}_{io}$ is the controlled-NOT with
the input Qbit as control and the output as target, and $\mathbf{X}_o$ acts as NOT
on the output register. These possibilities are illustrated in the circuit
diagram of Figure 2.1.
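The identifications in (2.14) can be confirmed mechanically on the four
computational-basis states. In the sketch below the four functions are
taken to be $f_0 = 0$, $f_1(x) = x$, $f_2(x) = 1 \oplus x$, and $f_3 = 1$, which is how the
text and the caption of Figure 2.1 describe them. The Python names are
illustrative only.

```python
def U(f):
    """U_f on basis labels: (x, y) -> (x, y XOR f(x)), as in (2.13)."""
    return lambda x, y: (x, y ^ f(x))

def identity(x, y):
    return (x, y)

def C_io(x, y):
    return (x, y ^ x)      # cNOT: input Qbit is control, output is target

def X_o(x, y):
    return (x, y ^ 1)      # NOT on the output Qbit

def compose(*gates):
    """Operator product: the rightmost gate acts first."""
    def combined(x, y):
        for gate in reversed(gates):
            x, y = gate(x, y)
        return x, y
    return combined

functions = [lambda x: 0, lambda x: x, lambda x: 1 - x, lambda x: 1]
circuits = [identity, C_io, compose(C_io, X_o), X_o]
basis = [(x, y) for x in (0, 1) for y in (0, 1)]

matches = [all(U(f)(x, y) == c(x, y) for x, y in basis)
           for f, c in zip(functions, circuits)]
print(matches)  # prints [True, True, True, True]
```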

Suppose that we are given a black box that executes $\mathbf{U}_f$ for one of
the four functions, but are not told which of the four operations (2.14)
the box carries out. We can, of course, find out by letting the black box
act twice - first on $|0\rangle|0\rangle$ and then on $|1\rangle|0\rangle$. But suppose that we can
only let the box act once. What can we learn about f?

Table 2.1. The four distinct functions $f_j(x)$ that take one bit into one bit

          x = 0    x = 1
$f_0$       0        0
$f_1$       0        1
$f_2$       1        0
$f_3$       1        1

Fig 2.1 A way to construct, with elementary gates, each of the black
boxes $\mathbf{U}_f$ that realize the four possible functions f that appear in
Deutsch’s problem. In case 00, f is identically 0 and it is evident from
the general form at the top of the figure that $\mathbf{U}_f$ acts as the
identity. In case 01, f(x) = x, so $\mathbf{U}_f$ acts as cNOT, with the input
register as the control Qbit. In case 10, f interchanges 0 and 1, so
$\mathbf{U}_f$ applies NOT to the target Qbit if and only if the
computational-basis state of the control Qbit is $|0\rangle$. This is equivalent
to combining a cNOT with an unconditional NOT on the target Qbit. In
case 11, f is identically 1, and the effect of $\mathbf{U}_f$ is just to apply NOT
to the output register, whatever the state of the input register. Note
the diagrammatic convention for controlled operations: the control Qbit
is represented by the wire with the black dot on it; the target Qbit is
connected to the control by a vertical line ending in a box containing
the controlled operation. An alternative representation for cNOT
appears in Figure 2.7.
[Circuit diagram: the general form $|x\rangle|y\rangle \to |x\rangle|y \oplus f(x)\rangle$ at the top,
followed by the four cases labelled by the values f(0) f(1).]

In a classical computer, where we are effectively restricted to letting
the black box act on Qbits in one of the four computational-basis states,
we can learn either the value of f(0) (if we let $\mathbf{U}_f$ act on either
$|0\rangle|0\rangle$ or $|0\rangle|1\rangle$) or the value of f(1) (if we let $\mathbf{U}_f$ act on either
$|1\rangle|0\rangle$ or $|1\rangle|1\rangle$). If we choose to learn the value of f(0), then we can
restrict f to being either $f_0$ or $f_1$ (if f(0) = 0) or to being either
$f_2$ or $f_3$ (if f(0) = 1). If we choose to learn the value of f(1), then
we can restrict f to being either $f_0$ or $f_2$ (if f(1) = 0) or to being
either $f_1$ or $f_3$ (if f(1) = 1).

Suppose, however, that we want to learn whether f is constant
(f(0) = f(1), satisfied by $f_0$ and $f_3$) or not constant (f(0) ≠ f(1),
satisfied by $f_1$ and $f_2$). We then have no choice with a classical
computer but to evaluate both f(0) and f(1) and compare them. In this
way we determine whether or not f is constant, but we have to extract
complete information about f to do so. We have to run $\mathbf{U}_f$ twice.

Remarkably, it turns out that with a quantum computer we do not
have to run $\mathbf{U}_f$ twice to determine whether or not f is constant. We
can do this in a single run. Interestingly, when we do this we learn
nothing whatever about the individual values of f(0) and f(1), but
we are nevertheless able to answer the question about their relative
values: whether or not they are the same. Thus we get less information
than we get in answering the question with a classical computer, but
by renouncing the possibility of acquiring that part of the information
which is irrelevant to the question we wish to answer, we can get the
answer with only a single application of the black box.

(2) There is a second way to look at Deutsch’s problem, which gives
it nontrivial mathematical content. One can think of x as specifying a
choice of two different inputs to an elaborate subroutine that requires
many additional Qbits, and one can think of f(x) as characterizing
a two-valued property of the output of that subroutine. For example
f(x) might be the value of the millionth bit in the binary expansion of
$\sqrt{2 + x}$, so that f(0) is the millionth bit in the expansion of $\sqrt{2}$ while
f(1) is the millionth bit of $\sqrt{3}$. In this case the input register feeds
data into the subroutine and the subroutine reports back to the output
register.

In the course of the calculation the input and output registers will in
general become entangled with the additional Qbits used by the
subroutine. If the entanglement persists to the end of the calculation,
the input and output registers will have no final states of their own,
and it will be impossible to describe the computational process as the
simple unitary transformation (2.1). We shall see in Section 2.3,
however, that it is possible to set things up so that at the end of the
computation the additional Qbits required for the subroutine are no
longer entangled with the input and output registers, so that the
additional Qbits can indeed be ignored. The simple linear
transformation (2.1) then correctly characterizes the net effect of the
computation on those two registers.

Under interpretation (1) of Deutsch’s problem, answering the question
of whether f is or is not constant amounts to learning something
about the nature of the black box that executes $\mathbf{U}_f$ without actually
opening it up and looking inside. Under interpretation (2) it becomes
the nontrivial question of whether the millionth bits of $\sqrt{2}$ and $\sqrt{3}$
agree or disagree. Under either interpretation, to answer the question
with a classical computer we can do no better than to run the black box
twice, with both 0 and 1 as inputs, and compare the two outputs.

In the quantum case we could try the standard trick, preparing the
input register in the superposition $(1/\sqrt{2})(|0\rangle + |1\rangle)$. After a single
application of $\mathbf{U}_f$ the final state of the 1-Qbit input and output
registers would then be

$$\mathbf{U}_f(\mathbf{H} \otimes \mathbf{1})\bigl(|0\rangle|0\rangle\bigr) = \tfrac{1}{\sqrt{2}}|0\rangle|f(0)\rangle + \tfrac{1}{\sqrt{2}}|1\rangle|f(1)\rangle, \tag{2.15}$$

as described in (2.7). If we then measured the input and output registers
we could learn, under case (2), the millionth bit of either $\sqrt{2}$ or $\sqrt{3}$,
as well as learning which we had learned. The choice of which we did
learn would be random. This procedure offers no improvement on the
classical situation.

It was first noticed that, without making any further use of $\mathbf{U}_f$, there
are additional unitary transformations one can apply to the state (2.15)
before carrying out the measurement that enable you half the time
to state with assurance whether or not f(0) = f(1). (This imperfect
solution to Deutsch’s problem has some interesting features, which we
explore further in Appendix F.) Some time later, it was realized that
you can always answer the question if you apply appropriate unitary
transformations before as well as after the single application of $\mathbf{U}_f$.
Here is how the trick is done.

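To get the output (2.15) we took the input to $\mathbf{U}_f$ to be the state

$$(\mathbf{H} \otimes \mathbf{1})\bigl(|0\rangle|0\rangle\bigr). \tag{2.16}$$

Instead of doing this, we again start with both input and output registers
in the state $|0\rangle$, but then we apply the NOT operation X to both
registers, followed by an application of the Hadamard transform to
both. Since $\mathbf{X}|0\rangle = |1\rangle$ and $\mathbf{H}|1\rangle = (1/\sqrt{2})(|0\rangle - |1\rangle)$, the input to $\mathbf{U}_f$
is now described by the state

$$\begin{aligned}
(\mathbf{H} \otimes \mathbf{H})(\mathbf{X} \otimes \mathbf{X})\bigl(|0\rangle|0\rangle\bigr) &= (\mathbf{H} \otimes \mathbf{H})\bigl(|1\rangle|1\rangle\bigr) \\
&= \bigl(\tfrac{1}{\sqrt{2}}|0\rangle - \tfrac{1}{\sqrt{2}}|1\rangle\bigr)\bigl(\tfrac{1}{\sqrt{2}}|0\rangle - \tfrac{1}{\sqrt{2}}|1\rangle\bigr) \\
&= \tfrac{1}{2}\bigl(|0\rangle|0\rangle - |1\rangle|0\rangle - |0\rangle|1\rangle + |1\rangle|1\rangle\bigr).
\end{aligned} \tag{2.17}$$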

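If we take the state (2.17) as input to $\mathbf{U}_f$, then by linearity the
resulting state is

$$\tfrac{1}{2}\bigl(\mathbf{U}_f(|0\rangle|0\rangle) - \mathbf{U}_f(|1\rangle|0\rangle) - \mathbf{U}_f(|0\rangle|1\rangle) + \mathbf{U}_f(|1\rangle|1\rangle)\bigr). \tag{2.18}$$

It follows from the explicit form (2.13) of the action of $\mathbf{U}_f$ on the
computational-basis states that this is simply

$$\tfrac{1}{2}\bigl(|0\rangle|f(0)\rangle - |1\rangle|f(1)\rangle - |0\rangle|\tilde{f}(0)\rangle + |1\rangle|\tilde{f}(1)\rangle\bigr), \tag{2.19}$$

where, as earlier, $\tilde{x} = 1 \oplus x$ so that $\tilde{1} = 0$ and $\tilde{0} = 1$, and
$\tilde{f}(x) = 1 \oplus f(x)$. So if f(0) = f(1) the output state (2.19) is

$$\tfrac{1}{2}\bigl(|0\rangle - |1\rangle\bigr)\bigl(|f(0)\rangle - |\tilde{f}(0)\rangle\bigr), \quad f(0) = f(1), \tag{2.20}$$

but if f(0) ≠ f(1) then $f(1) = \tilde{f}(0)$, $\tilde{f}(1) = f(0)$, and the output
state (2.19) becomes

$$\tfrac{1}{2}\bigl(|0\rangle + |1\rangle\bigr)\bigl(|f(0)\rangle - |\tilde{f}(0)\rangle\bigr), \quad f(0) \neq f(1). \tag{2.21}$$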

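If, finally, we apply a Hadamard transformation to the input register,
these become

$$|1\rangle\,\tfrac{1}{\sqrt{2}}\bigl(|f(0)\rangle - |\tilde{f}(0)\rangle\bigr), \quad f(0) = f(1), \tag{2.22}$$

$$|0\rangle\,\tfrac{1}{\sqrt{2}}\bigl(|f(0)\rangle - |\tilde{f}(0)\rangle\bigr), \quad f(0) \neq f(1). \tag{2.23}$$

On putting together all the operations in a form we can compare
with the more straightforward computation (2.15), we have

$$(\mathbf{H} \otimes \mathbf{1})\mathbf{U}_f(\mathbf{H} \otimes \mathbf{H})(\mathbf{X} \otimes \mathbf{X})\bigl(|0\rangle|0\rangle\bigr) =
\begin{cases}
|1\rangle\,\tfrac{1}{\sqrt{2}}\bigl(|f(0)\rangle - |\tilde{f}(0)\rangle\bigr), & f(0) = f(1), \\
|0\rangle\,\tfrac{1}{\sqrt{2}}\bigl(|f(0)\rangle - |\tilde{f}(0)\rangle\bigr), & f(0) \neq f(1).
\end{cases} \tag{2.24}$$

Thus the state of the input register has ended up as $|1\rangle$ or $|0\rangle$
depending on whether or not f(0) = f(1), so by measuring the input
register we can indeed answer the question of whether f(0) and f(1)
are or are not the same!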
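The whole procedure (2.24) is small enough to simulate directly. The
sketch below represents 2-Qbit states as lists of four amplitudes
ordered $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (input Qbit on the left), builds $\mathbf{U}_f$ from
(2.13), and reports the bit that a measurement of the input register
would give. The matrix helpers and function names are assumptions of
this illustration.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]
X = [[0, 1], [1, 0]]
ONE = [[1, 0], [0, 1]]

def kron(A, B):
    """Tensor product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def U_f(f):
    """4x4 matrix sending |x>|y> to |x>|y XOR f(x)>, with index 2x + y."""
    M = [[0] * 4 for _ in range(4)]
    for x in (0, 1):
        for y in (0, 1):
            M[2 * x + (y ^ f(x))][2 * x + y] = 1
    return M

def deutsch_input_bit(f):
    """Run (H x 1) U_f (H x H)(X x X)|00> and return the input-register bit."""
    state = [1, 0, 0, 0]
    for gate in (kron(X, X), kron(H, H), U_f(f), kron(H, ONE)):
        state = apply(gate, state)
    p1 = state[2] ** 2 + state[3] ** 2   # probability that the input Qbit reads 1
    return round(p1)                     # p1 is 0 or 1 up to rounding error

functions = [lambda x: 0, lambda x: x, lambda x: 1 - x, lambda x: 1]
print([deutsch_input_bit(f) for f in functions])  # prints [1, 0, 0, 1]
```

The two constant functions give 1 and the two non-constant functions
give 0, each with a single application of $\mathbf{U}_f$.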

Notice that in either case the output register is left in the state
$(1/\sqrt{2})(|f(0)\rangle - |\tilde{f}(0)\rangle)$. Because the two terms in the superposition
have amplitudes with exactly the same magnitude, if one measures the
output register the result is equally likely to be f(0) or $\tilde{f}(0)$, and
one learns absolutely nothing about the actual value of f(0). The output
register contains no useful information at all.

Another way to put it is that the final state of the output register
is $\pm(1/\sqrt{2})(|0\rangle - |1\rangle)$ depending on whether f(0) = 0 or f(0) = 1.
Since a change in the overall sign of a state (or, more generally, the
presence of an overall complex factor of modulus 1) has no effect on
the statistical distribution of measurement outcomes, there is no way
to distinguish between these two cases.

Thus the price one has paid to learn whether f(0) and f(1) are or
are not the same is the loss of any information whatever about the actual
value of either of them. One has still eliminated only two of the four
possible forms for the function f. What the quantum computer gives
us is the ability to make this particular discrimination with just a single
invocation of the black box. No classical computer can do this.

There is a rather neat circuit-theoretic way of seeing why this trick
enables one to learn whether or not f(0) = f(1) in just one application
of $\mathbf{U}_f$, without going through any of the above algebraic manipulations.
This quite different way of looking at Deutsch’s problem is illustrated
in Figures 2.1-2.3. The basic idea is that for each of the four possible
choices for the function f, the 2-Qbit unitary transformation $\mathbf{U}_f$
behaves in exactly the same way as the equivalent circuit constructed
out of a NOT and/or a cNOT gate pictured in Figure 2.1. Consequently
applying Hadamard gates to each Qbit, both before and after the
application of $\mathbf{U}_f$, must produce exactly the same result as it would if
the Hadamards were applied to the equivalent circuits in Figure 2.1.
Using the elementary identities in Figure 2.2, one easily demonstrates
that those results are as shown in Figure 2.3. But Figure 2.3 shows
explicitly that when $\mathbf{U}_f$ is so sandwiched between Hadamards, the input
register ends up in the state $|1\rangle$ if f(0) = f(1) and in the state $|0\rangle$ if
f(0) ≠ f(1), in agreement with (2.24).

Fig 2.2 Some elementary circuit identities. (a) $\mathbf{H}^2 = \mathbf{1}$. (b) HXH = Z.
(c) A consequence of (a) and (b). (d) A consequence of (a) and (c).
(e) The action of the controlled-Z gate does not depend on which Qbit
is control and which is target, since it acts as the identity on each of
the states $|00\rangle$, $|01\rangle$, and $|10\rangle$ and multiplies the state $|11\rangle$ by -1.
(f) This follows from (d), (a), and (e).



When one thinks of applying this to learn whether the millionth bits
of $\sqrt{2}$ and $\sqrt{3}$ are the same or different, as in the second interpretation
of Deutsch’s problem, it is quite startling that one can do this with no
more effort (except for a simple modification of the initial and final
states) than one uses to calculate the millionth bit of either $\sqrt{2}$ or $\sqrt{3}$.
In this case, however, there is an irritating catch, which we note at the
end of Section 2.3.
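
The sandwich $(H \otimes H) U_f (H \otimes H)$ of Figure 2.3 can also be checked
numerically. The following Python sketch is an illustration only, not part
of the construction above: it stores a 2-Qbit state as a dictionary of
amplitudes, represents each f by the pair (f(0), f(1)), and starts from
$|0\rangle|1\rangle$. The helper names (apply_h_both, apply_uf,
deutsch_input_bit) are assumptions introduced for the example.

```python
from itertools import product
from math import sqrt, isclose

S = 1 / sqrt(2)
H = [[S, S], [S, -S]]          # 1-Qbit Hadamard matrix
BASIS = list(product((0, 1), repeat=2))


def apply_h_both(state):
    """Apply H (x) H to a 2-Qbit state {(x, y): amplitude}."""
    out = {b: 0.0 for b in BASIS}
    for (x, y), amp in state.items():
        for x2, y2 in BASIS:
            out[(x2, y2)] += H[x2][x] * H[y2][y] * amp
    return out


def apply_uf(state, f):
    """U_f |x>|y> = |x>|y XOR f(x)>, with f given as the pair (f(0), f(1))."""
    out = {b: 0.0 for b in BASIS}
    for (x, y), amp in state.items():
        out[(x, y ^ f[x])] += amp
    return out


def deutsch_input_bit(f):
    """Run (H(x)H) U_f (H(x)H) on |0>|1> and return the input-register bit."""
    state = {b: 0.0 for b in BASIS}
    state[(0, 1)] = 1.0
    state = apply_h_both(apply_uf(apply_h_both(state), f))
    # Probability that a measurement of the input register gives 1.
    p1 = sum(abs(amp) ** 2 for (x, _), amp in state.items() if x == 1)
    # The outcome should be certain, one way or the other.
    assert isclose(p1, 0.0, abs_tol=1e-12) or isclose(p1, 1.0, abs_tol=1e-12)
    return round(p1)


# One invocation of U_f per function; 0 means f(0) = f(1), 1 means they differ.
results = {f: deutsch_input_bit(f) for f in BASIS}
```

For each of the four functions the input register ends up in a definite
computational-basis state, so a single measurement settles whether
$f(0) = f(1)$.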


2.3 Why additional Qbits needn't mess things up 

Now that we have a specific example of a quantum computation to keep
in mind, we can address an important and very general issue mentioned
in Section 2.1. The computational process generally requires the use
of many Qbits besides the n + m in the input and output registers. In
the second interpretation of Deutsch’s problem, it may need a great
many more. The action of the computer is then described by a unitary
transformation $W_f$ that acts on the space associated with all the
Qbits: those in the input and output registers, together with the r
additional Qbits used in calculating the function f. Only under very
special circumstances will this global unitary transformation $W_f$ on
all n + m + r Qbits induce a transformation on the input and output
registers that can be described by a unitary transformation $U_f$ that
acts only on those two registers, as in (2.1). In general the input and
output registers will become entangled with the states of the additional
r Qbits, and cannot even be assigned a state.

But if the action of the computer on all n + m + r Qbits has a very
special form, then the input and output registers can indeed end up
with a state, related to their initial states through the desired unitary
transformation $U_f$. Let the additional r Qbits start off in some standard
initial state $|\psi\rangle_r$, so that the initial state of input register,
output register, and additional Qbits is

$$|\Psi\rangle_{n+m+r} = |x\rangle_n |y\rangle_m |\psi\rangle_r. \qquad (2.25)$$

Although the r additional Qbits may well become entangled with those
in the input and output registers in the course of the calculation - they
will have to if they are to serve any useful purpose - we require that
when the calculation is finished the final state of the computer must be
of the form

$$W_f |\Psi\rangle_{n+m+r} = |x\rangle_n |y \oplus f(x)\rangle_m |\phi\rangle_r, \qquad (2.26)$$

where the additional r Qbits not only are unentangled with the input
and output registers, but also have a state $|\phi\rangle_r$ that is independent of
the initial state of the input and output registers.

Because $W_f$ is linear on the whole (n + m + r)-Qbit subspace, and
because $|\psi\rangle_r$ and $|\phi\rangle_r$ are independent of the initial
computational-basis state of the input and output registers, it follows
that if the input and output registers are initially assigned any
superposition of computational-basis states, then $W_f$ leaves them with
a definite final state, which is related to their initial state by
precisely the unitary transformation $U_f$ of (2.1).


Fig 2.3

We can get the action of $U_f$, when it is preceded and followed by
Hadamards on both Qbits, by applying the appropriate identities of
Figure 2.2 to the diagrams of Figure 2.1. Case 00 is unchanged because
of Figure 2.2(a). In case 01 the target and control Qbits of the cNOT
are interchanged because of Figure 2.2(f). The form in case 10 follows
from the corresponding form in Figure 2.1 because of Figures 2.2(f) and
2.2(b). The form in case 11 follows from Figures 2.2(a) and 2.2(b). If
the initial state of the output register (lower wire) is $|1\rangle$ and the
initial state of the input register (upper wire) is either of the two
computational-basis states, then the initial state of the input
register will be unchanged in cases 00 and 11, and flipped in cases 01
and 10, so by measuring the input register after the action of
$(H \otimes H) U_f (H \otimes H)$ one can determine whether or not
$f(0) = f(1)$.

Fig 2.4

A schematic representation of the standard unitary transformation $U_f$
for evaluating a function f taking a number $0 \le x < 2^n$ into a number
$0 \le f(x) < 2^m$. The heavy horizontal lines (bars) represent
multiple-Qbit inputs. In order for the computation to be reversible
even when f is not one-to-one, two multi-Qbit registers must be used.

Fig 2.5

A more realistic picture of the computation represented in Figure 2.4.
Many additional Qbits may be needed to carry out the calculation. These
are represented by an r-Qbit bar in addition to the n- and m-Qbit bars
representing the input and output registers. The computation is
actually executed by a unitary transformation $W_f$ that acts on the
larger space of all n + m + r Qbits. The representation of Figure 2.4
is correct only if the action of this larger unitary transformation
$W_f$ on the input and output registers alone can be represented by a
unitary transformation $U_f$. This will be the case if the action of
$W_f$ on the residual r Qbits is to take them from an initial pure state
$|\psi\rangle_r$ to a final pure state $|\phi\rangle_r$ that is independent of the
initial contents of the input and output registers.


Therefore we can indeed use (2.1), ignoring complications associated
with the additional r Qbits needed to compute the function f, if both
the initial and the final states of the additional Qbits are independent
of the initial states of the input and output registers. Independence of
the initial states can be arranged by initializing the additional r Qbits
to some standard state, for example $|\psi\rangle_r = |0\rangle_r$. A standard final
state $|\phi\rangle_r$ of the r Qbits, which is, in fact, identical to their initial
state $|\psi\rangle_r$, can be produced by taking appropriate advantage of the
fact that unitary transformations are reversible.

We do the trick in three stages. 

(1) Begin the computation by applying a unitary transformation $V_f$ that
acts only on the n-Qbit input register and the r additional Qbits,
doing nothing to the output register. Because there is no action on
the output register, the n + r Qbits on which $V_f$ acts continue to
have a state of their own. If the initial state of the input register is
$|x\rangle_n$, the unitary transformation $V_f$ is designed, using standard tricks
of reversible classical computation (about which we shall have more
to say in Section 2.6), to construct f(x) in an appropriate m-Qbit
subset of the n + r Qbits, given x in the input register.

(2) Next change the y initially in the output register to $y \oplus f(x)$,
as (2.1) or (2.26) specifies, without altering the state of the n + r
other Qbits. This can be done with m cNOT gates that combine
to make up a unitary transformation $C_m$. The m control Qbits are
those among the n + r that represent the result of the computation
f(x); the m target Qbits are the ones in the corresponding positions
of the output register.

(3) Since the state of the n + r Qbits is not altered by the application
of $C_m$, we can finally apply to them the inverse transformation
$V_f^\dagger$ to restore them to their original state. We have thus produced
the required unitary transformation $W_f$ in (2.26), with the final state
$|\phi\rangle_r$ of the r additional Qbits being identical to their initial state
$|\psi\rangle_r$. This whole construction is illustrated by the circuit diagrams
of Figures 2.4-2.7.
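
The three stages can be traced on computational-basis states with a short
Python sketch. It is only an illustration under simplifying assumptions:
the hypothetical V_f used here XORs f(x), together with some leftover
scratch bits junk(x), into an ancilla that starts in 0, so it is a
permutation of basis states and happens to be its own inverse. The
functions f and junk and the register sizes are invented for the example.

```python
def make_v(f, junk, m):
    """Build a stand-in for V_f acting on (input x, ancilla a).

    It XORs f(x) into the low m ancilla bits and junk(x) into the bits
    above them. XOR-ing a function of x into a is a bijection on (x, a)
    pairs and is its own inverse, so V_f^dagger = V_f in this sketch.
    """
    def v(x, a):
        return x, a ^ (f(x) | (junk(x) << m))
    return v


def w(x, y, a, v, m):
    """W_f = V_f^dagger C_m V_f acting on the basis state |x>|y>|a>."""
    x, a = v(x, a)               # stage 1: compute f(x) (and junk) in the ancilla
    y ^= a & ((1 << m) - 1)      # stage 2: m cNOTs XOR f(x) into the output register
    x, a = v(x, a)               # stage 3: undo stage 1, restoring the ancilla
    return x, y, a


n, m = 3, 2
f = lambda x: (x * x) % 4          # any function from n bits to m bits
junk = lambda x: (3 * x + 1) % 8   # stand-in for leftover scratch work

v = make_v(f, junk, m)
ok = all(w(x, y, 0, v, m) == (x, y ^ f(x), 0)
         for x in range(2 ** n) for y in range(2 ** m))
```

Because the ancilla returns to 0 for every x and y, linearity extends the
result to arbitrary superpositions of input and output states, which is the
content of (2.26).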

The need for this, or some equivalent procedure, negates some of the
hype one sometimes encounters in discussions of Deutsch’s problem.
It is sometimes said that by using a quantum computer one can learn
whether or not f(x) = f(y) in no more time than it takes to perform a
single evaluation of f. This is true only under the first, arithmetically
uninteresting, interpretation of Deutsch’s problem. If, however, one is
thinking of f as a function of mathematical interest evaluated by an
elaborate subroutine, then to evaluate f for a single value of x there is
no need to undo the effect of the unitary transformation $V_f$ on the
additional registers. But for the trick that determines whether or not
f(x) = f(y) it is absolutely essential to apply $V_f^\dagger$ to undo the
effect of $V_f$. This doubles the time of the computation.

This misrepresentation of the situation is not entirely dishonorable,
however, since in almost all other examples the speed-up is by
considerably more than a factor of two, and the necessary doubling of
computational time is an insignificant price to pay. We turn immediately
to an elementary example.


Fig 2.6

A more detailed view of the structure of the unitary transformation
$W_f$ of Figure 2.5. Algebraically, $W_f = V_f^\dagger C_m V_f$. First a
unitary transformation $V_f$ acts only on the n-Qbit input register and r
additional Qbits, acting as the identity on the output register. This
transformation takes the n + r Qbits into a state in which an m-Qbit
subset represents the result of the calculation, f(x). Second, m
controlled-NOT transformations (described in more detail in Figure
2.7) act only on the m Qbits representing f(x) and the m Qbits of the
output register, leaving the former m unchanged but changing the
number represented by the output register from y to $y \oplus f(x)$.
Finally, the inverse $V_f^\dagger$ of $V_f$ is applied to the n + r Qbits on
the top two bars, to restore them to their (unentangled) initial states.

Fig 2.7

A more detailed picture of the $C_m$ unitary transformation in
Figure 2.6, for the case m = 5. Each of the input and output bars
contains five Qbits, represented by sets of five thin lines (wires). Five
different 2-Qbit controlled-NOT gates link the five upper wires
representing f(x) to the five lower wires representing the
corresponding positions in the output register. The action of a single
such cNOT gate is shown in the lower part of the figure. Note the
alternative convention for a cNOT gate: the black dot on the wire
representing the control Qbit is connected by a vertical line to an open
circle on the wire representing the target Qbit. The other convention
(used above in Figures 2.1-2.3) replaces the open circle by a square
box containing the NOT operator X that may act on the target Qbit.
The advantages of the circle representation are that it suggests the
symbol $\oplus$ that represents the XOR operation, and that it is easier to
draw quickly on a blackboard. The advantages of using X are that it
makes the algebraic relations more evident when NOT operations X,
Z operations, or controlled-Z operations also appear, and that it
follows the form used for all other controlled unitaries.


2.4 The Bernstein-Vazirani problem 

Like many of the examples discovered before Shor’s factoring algorithm,
this has a somewhat artificial character. Its significance lies not
in the intrinsic arithmetical interest of the problem, but in the fact that
it can be solved dramatically and unambiguously faster on a quantum
computer.

Let a be an unknown non-negative integer less than $2^n$. Let f(x)
take any other such integer x into the modulo-2 sum of the products of
corresponding bits of a and x, which we denote by $a \cdot x$ (in recognition
of the fact that it is a kind of bitwise modulo-2 inner product):

$$a \cdot x = a_0 x_0 \oplus a_1 x_1 \oplus a_2 x_2 \cdots. \qquad (2.27)$$

Suppose that we have a subroutine that evaluates $f(x) = a \cdot x$. How
many times do we have to call that subroutine to determine the value
of the integer a? Here and in all subsequent examples, we shall assume
that any Qbits acted on by such subroutines, except for the Qbits of
the input and output registers, are returned to their initial state at the
end of the computation, as discussed in Section 2.3.

The mth bit of a is $a \cdot 2^m$, since the binary expansion of $2^m$ has 1 in
position m and 0 in all the other positions. So with a classical computer
we can learn the n bits of a by applying f to the n values $x = 2^m$,
$0 \le m < n$. This, or any other classical method one can think of, requires n
different invocations of the subroutine. But with a quantum computer
a single invocation is enough to determine a completely, regardless of
how big n is!
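
The classical procedure is easy to spell out. In the Python sketch below,
which is only an illustration, dot implements the product (2.27), and the
subroutine is wrapped so that its calls can be counted; the names dot,
learn_a_classically and secret are assumptions made for the example, and
the value a = 25 is the one used in Figure 2.8.

```python
def dot(a, x):
    """Bitwise modulo-2 inner product a . x of (2.27)."""
    return bin(a & x).count("1") % 2


def learn_a_classically(f, n):
    """Recover a from the n calls f(2**m), m = 0, ..., n - 1."""
    a = 0
    for m in range(n):
        a |= f(1 << m) << m     # f(2**m) is the m-th bit of a
    return a


secret = 0b11001                # a = 25
calls = []


def f(x):
    calls.append(x)             # record each invocation of the subroutine
    return dot(secret, x)


recovered = learn_a_classically(f, 5)
```

The subroutine is invoked once for each of the n bits of a, in agreement
with the count given above.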

I first describe the conventional way of seeing how this can be done,
and then describe a much simpler way to understand the process. The
conventional way exploits a trick (implicitly exploited in our solution to
Deutsch’s problem) that is useful in dealing with functions like f that
act on n Qbits with output to a single Qbit. If the 1-Qbit output register
is initially prepared in the state
$HX|0\rangle = H|1\rangle = (1/\sqrt{2})(|0\rangle - |1\rangle)$
then, since $U_f$ applied to the computational-basis state $|x\rangle_n|y\rangle_1$ flips
the value y of the output register if and only if f(x) = 1, we have

$$U_f |x\rangle_n \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle) = (-1)^{f(x)} |x\rangle_n \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle). \qquad (2.28)$$

So by taking the state of the 1-Qbit output register to be
$(1/\sqrt{2})(|0\rangle - |1\rangle)$, we convert a bit flip to an overall change of
sign. This becomes useful because of a second trick, which exploits a
generalization of the action (2.5) of $H^{\otimes n}$ on $|0\rangle_n$.

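The action of H on a single Qbit can be compactly summarized as

$$H|x\rangle_1 = \frac{1}{\sqrt{2}}\left(|0\rangle + (-1)^x |1\rangle\right) = \frac{1}{\sqrt{2}} \sum_{y=0}^{1} (-1)^{xy} |y\rangle. \qquad (2.29)$$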


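If we apply $H^{\otimes n}$ to an n-Qbit computational-basis state $|x\rangle_n$ we can
therefore express the result as

$$H^{\otimes n}|x\rangle_n = \frac{1}{2^{n/2}} \sum_{y_{n-1}=0}^{1} \cdots \sum_{y_0=0}^{1} (-1)^{\sum_{j=0}^{n-1} x_j y_j} |y_{n-1}\rangle \cdots |y_0\rangle = \frac{1}{2^{n/2}} \sum_{y=0}^{2^n-1} (-1)^{x \cdot y} |y\rangle_n, \qquad (2.30)$$

where the product $x \cdot y$ is the one defined in (2.27). (Because $-1$ is
raised to the power $\sum_j x_j y_j$, all that matters about the sum is its value
modulo 2.)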

Fig 2.8

An illustration of a circuit that implements the unitary subroutine
$U_f$ taking n-Qbit input and 1-Qbit output registers, initially in the
state $|x\rangle_n|y\rangle_1$, into $|x\rangle_n|y \oplus f(x)\rangle_1$, where
$f(x) = a \cdot x = \sum_{j=0}^{n-1} a_j x_j$ (mod 2). The Bernstein-Vazirani
problem asks us to determine all the bits of a with a single invocation
of the subroutine. In the illustration n = 5 and a = 25 = 11001. For
j = 0, ..., n - 1, each of the cNOT gates adds 1 (mod 2) to the output
register if and only if $a_j x_j = 1$. In addition to their normal labeling
with the 1-Qbit states they represent, the wires of the input register
are labeled with the bits of a ($a_4 = 1$, $a_3 = 1$, $a_2 = 0$, $a_1 = 0$,
$a_0 = 1$), to make it clear which (those associated with $a_j = 1$) act as
control bits for a cNOT targeted on the output register.

So if we start with the n-Qbit input register in the standard initial
state $H^{\otimes n}|0\rangle_n$, put the 1-Qbit output register into the state
$H|1\rangle$, apply $U_f$, and then again apply $H^{\otimes n}$ to the input
register, we get

$$
\begin{aligned}
(H^{\otimes n} \otimes 1)\, U_f\, (H^{\otimes n} \otimes H)\, |0\rangle_n |1\rangle_1
&= (H^{\otimes n} \otimes 1)\, U_f \left( \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} |x\rangle \right) \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle) \\
&= \frac{1}{2^{n/2}} \left( H^{\otimes n} \sum_{x=0}^{2^n-1} (-1)^{f(x)} |x\rangle \right) \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle) \\
&= \frac{1}{2^n} \sum_{x=0}^{2^n-1} \sum_{y=0}^{2^n-1} (-1)^{f(x) + x \cdot y} |y\rangle \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle). \qquad (2.31)
\end{aligned}
$$


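We do the sum over x first. If the function f(x) is $a \cdot x$ then this
sum produces the factor

$$\sum_{x=0}^{2^n-1} (-1)^{a \cdot x} (-1)^{y \cdot x} = \prod_{j=0}^{n-1} \sum_{x_j=0}^{1} (-1)^{(a_j + y_j) x_j}. \qquad (2.32)$$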


At least one term in the product vanishes unless every bit $y_j$ of y is
equal to the corresponding bit $a_j$ of a - i.e. unless y = a. Therefore
the entire computational process (2.31) reduces to

$$H^{\otimes (n+1)}\, U_f\, H^{\otimes (n+1)}\, |0\rangle_n |1\rangle_1 = |a\rangle_n |1\rangle_1, \qquad (2.33)$$

where I have applied a final H to the 1-Qbit output register to make the
final expression look a little neater and more symmetric. (I have also
restored subscripts to the state symbols for the n-Qbit input register
and 1-Qbit output register.)

So by putting the input and output registers into the appropriate
initial states, after a single invocation of the subroutine followed by an
application of $H^{\otimes n}$ to the input register, the state of the input register
becomes $|a\rangle_n$. As promised, all n bits of the number a can now be
determined by measuring the input register, even though we have called
the subroutine only once!
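
For small n the whole calculation can be reproduced on a classical
machine. The sketch below is an illustration only: it tracks just the
amplitudes of the input register, uses (2.28) to replace the action of
$U_f$ by the sign $(-1)^{f(x)}$, and applies $H^{\otimes n}$ through the sum
(2.30). The function names are assumptions, and the classical simulation
of course evaluates f for every x, which the quantum computer does not
have to do.

```python
def dot(a, x):
    """Bitwise modulo-2 inner product of (2.27)."""
    return bin(a & x).count("1") % 2


def hadamard_all(amps, n):
    """Apply H^{(x)n} via (2.30): new amp[y] = 2^(-n/2) sum_x (-1)^(x.y) amp[x]."""
    size = 2 ** n
    norm = size ** -0.5
    return [norm * sum((-1) ** dot(x, y) * amps[x] for x in range(size))
            for y in range(size)]


def bernstein_vazirani(f, n):
    """Simulate (2.31) for the input register; the output register stays in H|1>."""
    amps = [0.0] * 2 ** n
    amps[0] = 1.0                                  # input register starts in |0>_n
    amps = hadamard_all(amps, n)                   # uniform superposition of all x
    amps = [(-1) ** f(x) * amp for x, amp in enumerate(amps)]   # sign from (2.28)
    amps = hadamard_all(amps, n)                   # second set of Hadamards
    probs = [abs(amp) ** 2 for amp in amps]
    best = max(range(2 ** n), key=probs.__getitem__)
    return best, probs[best]


a = 0b11001
found, prob = bernstein_vazirani(lambda x: dot(a, x), 5)
```

The only outcome with appreciable probability is y = a, so a measurement of
the input register returns a, as (2.33) asserts.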

There is a second, complementary way to look at the Bernstein-Vazirani
problem that bypasses all of the preceding analysis, making
it evident why (2.33) holds, by examining a few circuit diagrams. The
idea is to note, just as we did for the black box of Deutsch’s problem
in (2.14), that the actions of the black boxes that implement $U_f$ for
the different available choices of f are identical to the actions of some
simple circuits.

When $f(x) = a \cdot x$, the action of $U_f$ on the computational basis
is to flip the 1-Qbit output register once, whenever a bit of x and the
corresponding bit of a are both 1. When the state of the input register
is $|x\rangle_n$ this action can be performed by a collection of cNOT gates all
targeted on the output register. There is one cNOT for each nonzero
bit of a, controlled by the Qbit representing the corresponding bit of
x. The combined effect of these cNOT gates on every computational-basis
state is precisely that of $U_f$. Therefore the effect of any other
transformations preceding and/or following $U_f$ can be understood by
examining their effect on this equivalent collection of cNOT gates, even
though $U_f$ may actually be implemented in a completely different way.

The encoding of a in the disposition of the equivalent cNOT gates
is illustrated in Figure 2.8. The application (2.33) of H to every Qbit in
the input and output registers both before and after the application of
$U_f$, pictured in Figure 2.9, converts every cNOT gate in the equivalent
representation of $U_f$ from $C_{ij}$ to $(H_i H_j) C_{ij} (H_i H_j) = C_{ji}$,
as pictured in Figure 2.10 (see also Equation (1.44)). After this reversal
of target and control Qbits, the output register controls every one of the
cNOT gates, and since the state of the output register is $|1\rangle$, every one
of the NOT operators acts. That action flips just those Qbits of the input
register for which the corresponding bit of a is 1. Since the input
register starts in the state $|0\rangle_n$, this changes the state of each Qbit of
the input register to $|1\rangle$, if and only if it corresponds to a nonzero bit
of a. As a result, the state of the input register changes from $|0\rangle_n$ to
$|a\rangle_n$, just as (2.33) asserts.
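
The identity that drives this argument, that Hadamards on both Qbits
interchange the control and target of a cNOT, can be confirmed by
multiplying out the 4 x 4 matrices. The following Python sketch does so
with plain lists; the helpers matmul, kron and cnot, and the convention
that the basis index is 2*q1 + q0, are assumptions made for the example.

```python
from math import sqrt, isclose


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Tensor (Kronecker) product of two matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def cnot(control, target):
    """4x4 cNOT on the basis |q1 q0>, index = 2*q1 + q0; Qbits are labeled 0 and 1."""
    m = [[0.0] * 4 for _ in range(4)]
    for idx in range(4):
        bits = [idx & 1, idx >> 1]        # bits[0] = q0, bits[1] = q1
        if bits[control]:
            bits[target] ^= 1             # flip the target when the control is 1
        m[bits[0] + 2 * bits[1]][idx] = 1.0
    return m


s = 1 / sqrt(2)
H = [[s, s], [s, -s]]
HH = kron(H, H)                           # a Hadamard on each Qbit

sandwiched = matmul(HH, matmul(cnot(1, 0), HH))
reversed_gate = cnot(0, 1)
same = all(isclose(sandwiched[i][j], reversed_gate[i][j], abs_tol=1e-12)
           for i in range(4) for j in range(4))
```

The sandwiched matrix agrees entry by entry with the cNOT whose control
and target have been exchanged.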

Fig 2.9

The solution to the Bernstein-Vazirani problem is to start with the
input register in the state $|0\rangle_n$ and the output register in the state
$|1\rangle_1$ and apply Hadamard transforms to all n + 1 Qbits before
applying $U_f$. Another n + 1 Hadamards are applied after $U_f$ has acted.
The cNOT gates reproduce the action of $U_f$, as shown in Figure 2.8.
The conventional analysis deduces the final state by calculating the
effect of the Hadamards on the initial state of the Qbits and on the
state subsequently produced by the action of $U_f$. A much easier way to
understand what is going on is to examine the effect of the Hadamards
on the collection of cNOT gates equivalent to $U_f$. This is shown in
Figure 2.10.

Fig 2.10

Sandwiching a cNOT gate between Hadamards that act on the control and
target Qbits has the effect of interchanging control and target, as
shown at the top of the figure. (See Equation (1.44) or Figure 2.2.)
Consequently the action of all the Hadamards in Figure 2.9 on the cNOT
gates between them is simply to interchange the control and target
Qbits, as shown in the lower part of the figure. In establishing this
one uses the fact that $H^2 = 1$, so that the H gates on wires that are
not control or target Qbits combine to give 1, and pairs of Hadamards
can be introduced between every X on the lowest wire, converting HXXXH
into (HXH)(HXH)(HXH). After the action of the Hadamards the cNOT gates
are controlled by the output register, so if the output register is in
the state $|1\rangle$ then all the X act on their input-register targets. If
the initial state of the input register is $|0\rangle_n$ then the effect of
each X is to change to $|1\rangle$ the state of each Qbit associated with a
bit of a that is 1. This converts the state of the input register to
$|a\rangle_n$.

Note how different these two explanations are. The first applies $U_f$
to the quantum superposition of all possible inputs, and then applies
operations that lead to perfect destructive interference of all states in
the superposition except for the one in which the input register is in the
state $|a\rangle$. The second suggests a specific mechanism for representing
the subroutine that executes $U_f$ and then shows that sandwiching such a
mechanism between Hadamards automatically imprints a on the input
register.

Interestingly, quantum mechanics appears in the second method 
only because it allows the reversal of the control and target Qbits of 
a cNOT operation solely by means of 1-Qbit (Hadamard) gates. One 
can also reverse control and target bits of a cNOT classically, but this 
requires the use of 2-Qbit SWAP gates, rather than 1-Qbit Hadamards. 
You can confirm for yourself that this circuit-theoretic solution to the 
Bernstein-Vazirani problem no longer works if one tries to replace all 
the Hadamard gates by any arrangement of SWAP gates. 


2.5 Simon's problem 

Simon’s problem, like the Bernstein-Vazirani problem, has an n-bit
nonzero number a built into the action of a subroutine $U_f$, and the aim
is to learn the value of a with as few invocations of the subroutine as
possible. In the Bernstein-Vazirani problem a classical computer must
call the subroutine n times to determine the value of a, while a quantum
computer need call the subroutine only once. The number of calls grows
linearly with n in the classical case, while being independent of n in
the quantum case. In Simon’s problem the speed-up with a quantum
computer is substantially more dramatic. With a classical computer the
number of times one must call the subroutine grows exponentially in
n, but with a quantum computer it grows only linearly.

This spectacular speed-up involves a probabilistic element characteristic
of many quantum computations. The characterization of how
the number of calls of the subroutine scales with the number of bits in a
applies not to calculating a directly, but to learning it with probability
very close to 1.

The subroutine $U_f$ in Simon’s problem evaluates a function f on
n bits that is two to one - i.e. it is a function from n to n - 1 bits. It is
constructed so that f(x) = f(y) if and only if the n-bit integers x and
y are related by $x = y \oplus a$ or, equivalently and more symmetrically,
$x \oplus y = a$, where $\oplus$ again denotes bitwise modulo-2 addition. One
can think of this as a period-finding problem. One is told that f is
periodic under bitwise modulo-2 addition,

$$f(x \oplus a) = f(x) \qquad (2.34)$$

for all x, and the problem is to find the period a. Simon’s problem is
thus a precursor of Shor’s much subtler and spectacularly more useful
period-finding algorithm - the heart of his factoring procedure - where
one finds the unknown period a of a function that is periodic under
ordinary addition: f(x + a) = f(x).

To find the value of a in (2.34) with a classical computer all you can
do is feed the subroutine different $x_1, x_2, x_3, \ldots$, listing the resulting
values of f until you stumble on an $x_j$ that yields one of the previously
computed values $f(x_i)$. You then know that $a = x_j \oplus x_i$. At any stage
of the process prior to success, if you have picked m different values
of x, then all you know is that $a \neq x_i \oplus x_j$ for all pairs of previously
selected values of x. You have therefore eliminated at most $\frac{1}{2}m(m-1)$
values of a. (You would have eliminated fewer values of a if you were
careless enough to pick an x equal to $x_i \oplus x_j \oplus x_k$ for three values
of x already selected.) Since there are $2^n - 1$ possibilities for a, your
chances of success will not be appreciable while $\frac{1}{2}m(m-1)$ remains
small compared with $2^n$. You are unlikely to succeed until m becomes
of the order of $2^{n/2}$, so the number of times the subroutine has to be
run to give an appreciable probability of determining a grows with the
number of bits n as $2^{n/2}$ - i.e. exponentially. If a has 100 bits a classical
computer would have to run the subroutine about $2^{50} \approx 10^{15}$ times
to have a significant chance of determining a. At ten million calls per
second it would take about three years.
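
A small Python sketch makes the classical search concrete. It is an
illustration under stated assumptions: make_simon_f builds one possible
two-to-one function with f(x) = f(x XOR a) by giving each pair {x, x XOR a}
its own label, and find_a_classically simply tries x = 0, 1, 2, ... in
order (any order of distinct inputs would do) until two inputs give the
same value. The last line redoes the arithmetic for $2^{50}$ calls at ten
million calls per second.

```python
from itertools import count


def make_simon_f(a, n):
    """A hypothetical two-to-one f on n bits with f(x) = f(x ^ a), for nonzero a."""
    labels, table = count(), {}
    for x in range(2 ** n):
        if x not in table:
            table[x] = table[x ^ a] = next(labels)   # one label per pair {x, x ^ a}
    return table.__getitem__


def find_a_classically(f, n):
    """Call f on distinct inputs until two collide; return (a, number of calls)."""
    seen = {}
    for calls, x in enumerate(range(2 ** n), start=1):
        value = f(x)
        if value in seen:
            return seen[value] ^ x, calls            # a = x_i XOR x_j
        seen[value] = x
    raise ValueError("f is not two to one")


a, n = 0b101101, 6
found, calls = find_a_classically(make_simon_f(a, n), n)

# Time for 2**50 calls at ten million calls per second, in years.
years = 2 ** 50 / 1e7 / (365.25 * 24 * 3600)
```

Since f takes only $2^{n-1}$ distinct values, a collision is guaranteed by
the time $2^{n-1} + 1$ inputs have been tried, but nothing better than the
exponential scaling described above is available classically.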

In contrast, a quantum computer can determine a with high probability
(say less than one chance in a million of failing) by running the
subroutine not very much more than n times - e.g. with about 120
invocations of the subroutine if a has 100 bits. This remarkable feat
can be accomplished with the following strategy.

We return to the standard procedure and apply the unitary transformation
$U_f$ only after the state of the input register has been transformed
into the uniformly weighted superposition (2.5) of all possible inputs
by the application of $H^{\otimes n}$, so that the effect of $U_f$ is to assign to the
input and output registers the entangled state

$$\frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} |x\rangle |f(x)\rangle. \qquad (2.35)$$

If we now subject only the output register to a measurement, then the
measurement gate is equally likely to indicate each of the $2^{n-1}$ different
values of f. Since each value of f appears in two terms in (2.35) that
have the same amplitudes, the generalized Born rule tells us that the
input register will be left in the state

$$\frac{1}{\sqrt{2}} \left(|x_0\rangle + |x_0 \oplus a\rangle\right) \qquad (2.36)$$

for that value of $x_0$ for which $f(x_0)$ agrees with the random value of f
given by the measurement.

At first glance this looks like great progress. We have produced a
superposition of just two computational-basis states, associated with
two n-bit integers, that differ (in the sense of $\oplus$) by a. If we knew those
two integers their bitwise modulo-2 sum would be a. But unfortunately,
as already noted, when a register is in a given quantum state there is in
general no way to learn what that state is. To be sure, if we could clone
the state, then by measuring a mere ten copies of it in the computational
basis we could with a probability of about 0.998 learn both $x_0$ and
$x_0 \oplus a$ and therefore a itself. But unfortunately, as we have also noted
earlier, one cannot clone an unknown quantum state. Nor does it help to run
the algorithm many times, since we are overwhelmingly likely to get
states of the form (2.36) for different random values of $x_0$. By subjecting
(2.36) to a direct measurement all we can learn is either $x_0$ - a random
number, or $x_0 \oplus a$ - another random number. The number a that we
would like to know appears only in the relation between two random
numbers, only one of which we can learn.

Nevertheless, as in Deutsch’s problem, if we renounce the possibility
of learning either number (which alone is of no interest at all), then
by applying some further operations before measuring we can extract
some useful partial information about their relationship - in this case
their modulo-2 sum a. With the input register in the state (2.36), we
apply the n-fold Hadamard transformation $H^{\otimes n}$. Equation (2.30) then
gives

$$H^{\otimes n} \frac{1}{\sqrt{2}} \left(|x_0\rangle + |x_0 \oplus a\rangle\right) = \frac{1}{2^{(n+1)/2}} \sum_{y=0}^{2^n-1} \left((-1)^{x_0 \cdot y} + (-1)^{(x_0 \oplus a) \cdot y}\right) |y\rangle. \qquad (2.37)$$

Since $(-1)^{(x_0 \oplus a) \cdot y} = (-1)^{x_0 \cdot y} (-1)^{a \cdot y}$, the coefficient of
$|y\rangle$ in (2.37) is 0 if $a \cdot y = 1$ and $2(-1)^{x_0 \cdot y}$ if $a \cdot y = 0$.
Therefore (2.37) becomes

$$\frac{1}{2^{(n-1)/2}} \sum_{a \cdot y = 0} (-1)^{x_0 \cdot y} |y\rangle, \qquad (2.38)$$

where the sum is now restricted to those y for which the modulo-2
bitwise inner product $a \cdot y$ is 0 rather than 1. So if we now measure the
input register, we learn (with equal probability) any of the values of y
for which $a \cdot y = 0$ - i.e. for which

$$\sum_{i=0}^{n-1} y_i a_i = 0 \pmod 2, \qquad (2.39)$$

where $a_i$ and $y_i$ are corresponding bits in the binary expansions of a
and y.
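
The statement that only values of y with $a \cdot y = 0$ can appear, each
with the same probability, is easy to check for a small example. The
Python sketch below is an illustration only: it starts from the state
(2.35), applies $H^{\otimes n}$ to the input register using (2.30), and adds
up the probability of each y over all values of the output register. The
particular function f(x) = min(x, x XOR a) is just one convenient
two-to-one choice, and all names are assumptions made for the example.

```python
from collections import defaultdict


def dot(a, x):
    """Bitwise modulo-2 inner product."""
    return bin(a & x).count("1") % 2


def simon_outcome_probabilities(f, n):
    """Probability of each y after H^{(x)n} acts on the input register of (2.35)."""
    size = 2 ** n
    # Inputs sharing a value of f belong to the same output-register state.
    groups = defaultdict(list)
    for x in range(size):
        groups[f(x)].append(x)
    probs = [0.0] * size
    for xs in groups.values():
        for y in range(size):
            # Amplitude of |y>|f-value>: 2^(-n/2) from (2.35) times 2^(-n/2) from (2.30).
            amp = sum((-1) ** dot(x, y) for x in xs) / size
            probs[y] += amp ** 2
    return probs


a, n = 0b110, 3
f = lambda x: min(x, x ^ a)       # hypothetical two-to-one f with f(x) = f(x ^ a)
probs = simon_outcome_probabilities(f, n)
allowed = [y for y in range(2 ** n) if probs[y] > 1e-12]
```

For a = 110 the surviving values of y are exactly those whose two leading
bits agree, each with probability $1/2^{n-1} = 1/4$.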

This completes our description of the quantum computation: with
each invocation of $U_f$ we learn a random y satisfying $a \cdot y = 0$. What
remains is the purely mathematical demonstration that this information
enables us to determine a with high probability with not many more
than n invocations. To see that this is plausible, note first that with just
a single invocation of $U_f$, unless we are unlucky enough to get y = 0
(which happens with the very small probability $1/2^{n-1}$), we learn a
nonzero value of y, and therefore a nontrivial subset of the n bits of
a whose modulo-2 sum vanishes. One of those bits is thus entirely
determined by the others in the subset, so we have cut the number of
possible choices for a in half, from $2^n - 1$ (the $-1$ reflecting the fact that
we are told that $a \neq 0$) to $2^{n-1} - 1$. In one invocation of the subroutine
we can, with very high probability, eliminate half the candidates for a!
(Contrast this to the classical case, in which a single invocation of $U_f$
can tell us nothing whatever about a.)

If we now repeat the whole procedure, then with very high probability
the new value of y that we learn will be neither 0 nor the same
as the value we learned the first time. We will therefore learn a new
nontrivial relation among the bits of a, which enables us to reduce the
number of candidates by another factor of 2, eliminating three quarters
of the possibilities available for a with two invocations of the
subroutine. (Compare this to the classical situation in which only a single
value of a can be removed with two invocations.)

If every time we repeat the procedure we have a good chance of
reducing the number of choices for a by another factor of 2, then with n
invocations of the subroutine we might well expect to have a significant
chance of learning a. This intuition is made precise in Appendix G,
where some slightly subtle but purely mathematical analysis shows that
with n + x invocations of $U_f$ the probability q of acquiring enough
information to determine a is

$$q = \left(1 - \frac{1}{2^{n+x}}\right)\left(1 - \frac{1}{2^{n+x-1}}\right) \cdots \left(1 - \frac{1}{2^{x+2}}\right) > 1 - \frac{1}{2^{x+1}}. \qquad (2.40)$$

Thus the odds are more than a million to one that with n + 20 invocations
of $U_f$ we will learn a, no matter how large n may be.
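
The numbers quoted here can be checked with exact rational arithmetic. The
Python sketch below is an illustration only; it assumes that the
probability in (2.40) is the product of the factors $1 - 1/2^k$ for k
running from x + 2 up to n + x, where x is the number of invocations
beyond n, and the function name is hypothetical.

```python
from fractions import Fraction


def success_probability(n, x):
    """q of (2.40), assumed to be the product of (1 - 2^-k) for k = x+2, ..., n+x."""
    q = Fraction(1)
    for k in range(x + 2, n + x + 1):
        q *= 1 - Fraction(1, 2 ** k)
    return q


# 100-bit a with 120 invocations of the subroutine, i.e. n = 100 and x = 20.
q = success_probability(100, 20)
bound = 1 - Fraction(1, 2 ** 21)
failure = float(1 - q)
```

With n = 100 and x = 20 the probability of failure is below $2^{-21}$,
which is less than one chance in a million.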

The intrusion of some mildly arcane arithmetic arguments, to confirm
that the output of the quantum computer does indeed provide
the needed information in the advertised number of runs, is characteristic
of many quantum-computational algorithms. The action of the
quantum computer itself is rather straightforward, but we must engage
in more strenuous mathematical exertions to show that the outcome
of the quantum computation does indeed enable us to accomplish the
required task.


2.6 Constructing Toffoli gates 

As noted in Section 1.6, constraints on what is physically feasible
limit us to unitary transformations that can be built entirely out of
1- and 2-Qbit gates. It is assumed that 1-Qbit unitary gates will be
relatively straightforward to make, though even this can be challenging for
many of the physical systems proposed for Qbits. Making 2-Qbit gates
presents an even tougher challenge to the quantum-computational
engineer, since they will require one to manipulate with precision the
physical interaction between the two Qbits. Making an inherently
3-Qbit gate goes beyond present hopes.

It has been known since before the arrival of quantum computation
that to build up all arithmetical operations on a reversible classical
computer it is necessary (and sufficient) to use at least one classically
irreducible 3-Cbit gate - for example controlled-controlled-NOT
(ccNOT) gates, known as Toffoli gates. Such 3-Cbit gates cannot be built
up out of 1- and 2-Cbit gates. This would appear to be bad news for
the prospects of practical quantum computation.

Remarkably, however, the linear extension of the Toffoli gate to 
Qbits can be built up out of 1-Qbit unitary gates acting in suitable 
combination with 2-Qbit cNOT gates. The quantum extension of this 
classically irreducible 3-Cbit gate can be realized with a rather small 
number of 1- and 2-Qbit gates. 

The 3-Qbit Toffoli gate T acts on the computational basis to flip
the state of the third (target) Qbit if and only if the states of both of the
first two (control) Qbits are 1:

$$T|x\rangle|y\rangle|z\rangle = |x\rangle|y\rangle|z \oplus xy\rangle. \qquad (2.41)$$

Since T is its own inverse, it is clearly reversible, and therefore its linear
extension from the classical basis to arbitrary 3-Qbit states is unitary,
by the general argument in Section 1.6.

The Toffoli gate enables one to calculate the logical AND of two bits
(i.e. their product) since $T|x\rangle|y\rangle|0\rangle = |x\rangle|y\rangle|xy\rangle$. Since all Boolean
operations can be built up out of AND and NOT, and since all of
arithmetic can be constructed out of Boolean operations, with Toffoli
gates one can build up all of classical computation through
reversible operations. (One can even produce NOT with a Toffoli gate:
$T|1\rangle|1\rangle|x\rangle = |1\rangle|1\rangle|\bar{x}\rangle$, where $\bar{x} = 1 \oplus x$, but this would be a
ridiculously hard way to implement NOT on a quantum computer.)
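
These classical properties of (2.41) are easy to confirm by brute force over the eight computational-basis states. The following is a minimal sketch acting on ordinary bits only; the function name `toffoli` is an illustrative choice.

```python
from itertools import product


def toffoli(x, y, z):
    """Action (2.41) on one computational-basis state: z -> z XOR (x AND y)."""
    return x, y, z ^ (x & y)


# T is its own inverse on all eight basis states, hence reversible.
for bits in product((0, 1), repeat=3):
    assert toffoli(*toffoli(*bits)) == bits

# With the target initially 0 the gate writes the AND of the control bits.
for x, y in product((0, 1), repeat=2):
    assert toffoli(x, y, 0)[2] == (x & y)

# With both control bits 1 the gate acts as NOT on the target.
for z in (0, 1):
    assert toffoli(1, 1, z)[2] == 1 - z
```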

There are (at least) two rather different ways to construct a ccNOT
gate T out of cNOT gates and 1-Qbit unitaries. The first way to be
found requires eight cNOT gates. Later a more efficient construction
was discovered that requires only six cNOT gates. Nobody has found
a construction with fewer than six cNOT gates, but I do not know
of a proof that six are required. I describe both constructions, since
they take advantage of and therefore illustrate several useful
quantum-computational tricks.

The construction of a Toffoli gate from eight cNOT gates is based
on three ingredients. (a) For any 1-Qbit unitary U one defines the
2-Qbit controlled-U gate $C^U$ as one that acts on the computational basis
as the identity if the state of Qbit 1 (the control Qbit) is $|0\rangle$ and acts on
Qbit 0 (the target Qbit) as U if the state of the control Qbit is $|1\rangle$:

$$C^U_{10}|x_1 x_0\rangle = U_0^{x_1}|x_1 x_0\rangle. \qquad (2.42)$$

(The cNOT operation C is thus a $C^X$ operation, but so important a
one as to make it the default form when no U is specified.) We shall
show that a controlled-U gate for arbitrary U can be built out of two
cNOT gates and 1-Qbit unitaries. (b) We shall show that a 3-Qbit
doubly-controlled-$U^2$ gate, which takes $|x_2 x_1 x_0\rangle$ into
$(U_0^2)^{x_2 x_1}|x_2 x_1 x_0\rangle$, can be constructed out of two
controlled-U gates, one controlled-$U^\dagger$ gate, and two additional cNOT
gates, making a total of eight cNOT gates. (c) We shall show that there
is a unitary square-root-of-NOT gate, $\sqrt{X}$. Taking U in (b) to be
$\sqrt{X}$ gives the desired Toffoli gate. We now elaborate on each part of
the construction.

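(a) Let V and W be two arbitrary 1-Qbit unitary transformations
and consider the product

$$V_0 C_{10} V_0^\dagger W_0 C_{10} W_0^\dagger. \qquad (2.43)$$

One easily confirms that (2.43) acts on the computational basis as $C^U_{10}$
with

$$U = (V X V^\dagger)(W X W^\dagger) = (V(\hat{x}\cdot\vec{\sigma})V^\dagger)(W(\hat{x}\cdot\vec{\sigma})W^\dagger). \qquad (2.44)$$

As shown in Appendix B, one can pick V and W so that

$$(V(\hat{x}\cdot\vec{\sigma})V^\dagger)(W(\hat{x}\cdot\vec{\sigma})W^\dagger) = (\vec{a}\cdot\vec{\sigma})(\vec{b}\cdot\vec{\sigma}) = \vec{a}\cdot\vec{b} + i(\vec{a}\times\vec{b})\cdot\vec{\sigma} \qquad (2.45)$$

for arbitrary unit vectors $\vec{a}$ and $\vec{b}$. Appendix B also establishes that
any 1-Qbit unitary transformation has, to within a multiplicative
numerical phase factor $e^{i\alpha}$, the form

$$u(\vec{n},\theta) = \exp\left(i\tfrac{1}{2}\theta\,\vec{n}\cdot\vec{\sigma}\right) = \cos(\tfrac{1}{2}\theta)\,1 + i\sin(\tfrac{1}{2}\theta)\,\vec{n}\cdot\vec{\sigma}. \qquad (2.46)$$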

If $\vec{a}$ and $\vec{b}$ are in the plane perpendicular to $\vec{n}$ and the angle between them
is $\tfrac{1}{2}\theta$, then $U = u(\vec{n}, \theta)$. The 1-Qbit unitary transformation
$E = e^{i\alpha\mathbf{n}}$ (where $\mathbf{n}$ is the 1-Qbit number operator, not the unit
vector $\vec{n}$), applied to Qbit 1, multiplies by the phase factor $e^{i\alpha}$ if and only if the
computational-basis state of Qbit 1 is $|1\rangle$. The resulting circuit for
constructing $C^U$ is shown in Figure 2.11.

Fig 2.11 How to construct a controlled-U gate $C^U$ from unitary 1-Qbit
gates and two controlled-NOT gates. If the control Qbit is in the state
$|0\rangle$, the operations on the target wire combine to give
$(VV^\dagger)(WW^\dagger) = 1$. But if the control Qbit is in the state $|1\rangle$ then
the operations combine to give $U = (V\sigma_x V^\dagger)(W\sigma_x W^\dagger)$, where
$\sigma_x = \hat{x}\cdot\vec{\sigma} = X$. To within an overall numerical phase factor a
general two-dimensional unitary transformation can always be put in this
form for appropriate V and W. The E on the control wire is the unitary
transformation

$$E = e^{i\alpha\mathbf{n}} = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\alpha} \end{pmatrix},$$

which supplies such a phase factor when the state of the control Qbit is
$|1\rangle$. The two unitary gates between the cNOT gates on the lower wire, W
and $V^\dagger$, can be combined into the single unitary gate $V^\dagger W$, so in
addition to the two cNOT gates the construction uses four 1-Qbit
unitaries.
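
As a numerical check of construction (a), the sketch below multiplies out (2.43), with E on the control wire, as 4×4 matrices in the basis $|x_1 x_0\rangle$. It assumes one particular choice that is not in the text: V and W are taken to be rotations about the z axis, so that $VXV^\dagger$ and $WXW^\dagger$ correspond to unit vectors in the x-y plane. The helper names are illustrative, and only the standard library is used.

```python
import cmath
from functools import reduce

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
# cNOT with Qbit 1 (left tensor factor) as control and Qbit 0 as target,
# in the basis |x1 x0> = |00>, |01>, |10>, |11>.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]


def mm(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def prod(*ms):
    return reduce(mm, ms)


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def z_rotation(phi):
    """A unitary V for which V X V-dagger = cos(phi) X + sin(phi) Y."""
    return [[cmath.exp(-0.5j * phi), 0], [0, cmath.exp(0.5j * phi)]]


def controlled_u_circuit(phi_a, phi_b, alpha):
    """E on the control wire times the product (2.43), as a 4x4 matrix."""
    v, w = z_rotation(phi_a), z_rotation(phi_b)
    e = [[1, 0], [0, cmath.exp(1j * alpha)]]

    def on_target(g):
        return kron(I2, g)

    return prod(kron(e, I2), on_target(v), CNOT, on_target(dagger(v)),
                on_target(w), CNOT, on_target(dagger(w)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Angles chosen so that (a.sigma)(b.sigma) = diag(e^{-i pi/4}, e^{i pi/4});
# the phase e^{i pi/4} from E then turns the result into a controlled-phase gate.
circuit = controlled_u_circuit(0.0, -cmath.pi / 4, cmath.pi / 4)
controlled_phase = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1j]]
print(close(circuit, controlled_phase))
```

Only two cNOT gates appear in the product; everything else is a 1-Qbit unitary on one wire, as the figure caption states.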

(b) Given such a controlled-U gate $C^U_{ij}$ with Qbit i the control and
j the target, a doubly-controlled-$U^2$ gate, controlled by Qbits 2 and 1
and targeting Qbit 0, can be constructed out of three such controlled-U
gates (one of them for $U^\dagger$) and two more cNOT gates:

$$C^{U^2} = C^U_{10}\,C_{21}\,C^{U^\dagger}_{10}\,C_{21}\,C^U_{20}. \qquad (2.47)$$

The corresponding circuit diagram is shown in Figure 2.12. It is
straightforward to establish that the sequence of operators on the right
of (2.47) acts on the 3-Qbit computational-basis states as 1 unless Qbits
2 and 1 are both in the state $|1\rangle$, in which case it acts on Qbit 0 as $U^2$.
(c) Finally, note that

$$\sqrt{Z} = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \qquad (2.48)$$

which is clearly unitary. Therefore, since X = HZH and $H^2 = 1$, we
have

$$\sqrt{X} = H\sqrt{Z}H. \qquad (2.49)$$

This plays the role of U in (b) to make the Toffoli gate.
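
Parts (b) and (c) can be checked together by building the five 3-Qbit operators as 8×8 matrices and multiplying them out. The sketch below assumes the operator ordering $C^U_{10}\,C_{21}\,C^{U^\dagger}_{10}\,C_{21}\,C^U_{20}$ for (2.47), takes $\sqrt{Z}$ to be diag(1, i), and labels basis states by the integer $4x_2 + 2x_1 + x_0$; the helper `controlled` is an illustrative name, not notation from the text.

```python
from functools import reduce


def mm(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def prod(*ms):
    return reduce(mm, ms)


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def controlled(u, control, target, n=3):
    """Matrix applying the 2x2 gate u to Qbit `target` when Qbit `control` is 1.

    Qbit k is bit k of the basis-state index."""
    dim = 2 ** n
    m = [[0] * dim for _ in range(dim)]
    for col in range(dim):
        if (col >> control) & 1:
            t = (col >> target) & 1
            for new_t in (0, 1):
                row = (col & ~(1 << target)) | (new_t << target)
                m[row][col] = u[new_t][t]
        else:
            m[col][col] = 1
    return m


S = 2 ** -0.5
H = [[S, S], [S, -S]]
X = [[0, 1], [1, 0]]
SQRT_Z = [[1, 0], [0, 1j]]
SQRT_X = prod(H, SQRT_Z, H)  # equation (2.49)

# Right-hand side of (2.47) with U = sqrt(X); the rightmost factor acts first.
circuit = prod(controlled(SQRT_X, 1, 0), controlled(X, 2, 1),
               controlled(dagger(SQRT_X), 1, 0), controlled(X, 2, 1),
               controlled(SQRT_X, 2, 0))

# Toffoli gate: flip bit 0 exactly when bits 2 and 1 are both set.
toffoli = [[1 if row == ((col ^ 1) if (col & 0b110) == 0b110 else col) else 0
            for col in range(8)] for row in range(8)]

print(close(prod(SQRT_X, SQRT_X), X), close(circuit, toffoli))
```

Each of the three controlled-$\sqrt{X}$ (or controlled-$\sqrt{X}^\dagger$) factors costs two cNOT gates by part (a), which together with the two explicit cNOT gates gives the count of eight.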

The alternative construction of the Toffoli gate that uses only six
cNOT gates has an action that is somewhat more transparent. It is
illustrated in Figure 2.13. If A and B are any two unitaries with
$A^2 = B^2 = 1$ then the 3-Qbit gate

$$C^B_{10}\,C^A_{20}\,C^B_{10}\,C^A_{20} \qquad (2.50)$$

clearly acts as the identity on the computational basis, unless the
states of Qbits 1 and 2 are both $|1\rangle$, in which case it acts as $(BA)^2$
on Qbit 0, so it is a doubly-controlled-$(BA)^2$ gate. Take $A = \vec{a}\cdot\vec{\sigma}$
and $B = \vec{b}\cdot\vec{\sigma}$ for unit vectors $\vec{a}$ and $\vec{b}$. Since $\vec{a}\cdot\vec{\sigma}$ can be
expressed as $V^\dagger(\hat{x}\cdot\vec{\sigma})V = V^\dagger X V$ for appropriate unitary V, each
controlled-A and controlled-B gate can be constructed with a single
controlled-NOT gate and 1-Qbit unitaries. The product BA is the
unitary $(\vec{b}\cdot\vec{\sigma})(\vec{a}\cdot\vec{\sigma}) = (\vec{b}\cdot\vec{a})1 + i(\vec{b}\times\vec{a})\cdot\vec{\sigma}$. Pick the
unit vectors $\vec{b}$ and $\vec{a}$ with the angle between them $\pi/4$, lying in
the plane perpendicular to $\hat{x}$ with their vector product directed along
$\hat{x}$, so that $(\vec{b}\cdot\vec{\sigma})(\vec{a}\cdot\vec{\sigma}) = \cos(\pi/4)1 + i\sin(\pi/4)\,\hat{x}\cdot\vec{\sigma}$. (For
example take $\vec{b} = \hat{z}$ and $\vec{a} = (1/\sqrt{2})(\hat{z} - \hat{y})$.) Then
$(BA)^2 = \cos(\pi/2)1 + i\sin(\pi/2)\,\hat{x}\cdot\vec{\sigma} = i\,\hat{x}\cdot\vec{\sigma} = iX$.

Fig 2.12 How to construct a 3-Qbit controlled-controlled-$U^2$ gate
from 2-Qbit controlled-NOT, controlled-U, and controlled-$U^\dagger$ gates.
If Qbits 2 and 1 (top and middle wires) are both in the state $|1\rangle$ then U
acts twice on Qbit 0 (bottom wire) but $U^\dagger$ does not. If Qbits 2 and 1 are
both in the state $|0\rangle$ nothing acts on Qbit 0. If Qbits 2 and 1 are in the
states $|1\rangle$ and $|0\rangle$ then only the U on the left and the $U^\dagger$ act on Qbit 0
(and their product is 1), and if Qbits 2 and 1 are in the states $|0\rangle$ and
$|1\rangle$ only the U on the right and the $U^\dagger$ act on Qbit 0 (and their product
is again 1).

Fig 2.13 How to make a doubly-controlled-NOT (Toffoli) gate using
six cNOT gates and 1-Qbit unitaries. The unitary operators A and B
are given by $A = \vec{a}\cdot\vec{\sigma}$ and $B = \vec{b}\cdot\vec{\sigma}$ for appropriately chosen
real unit vectors $\vec{a}$ and $\vec{b}$. Because $\vec{a}\cdot\vec{\sigma} = V^\dagger(\hat{x}\cdot\vec{\sigma})V$ for
appropriate unitary V, each controlled-A and controlled-B gate can be
constructed with a single controlled-NOT gate and 1-Qbit unitaries.
Because $A^2 = B^2 = 1$, the controlled-A and controlled-B gates act
together as a doubly-controlled-$(BA)^2$ gate. One can pick the
directions $\vec{a}$ and $\vec{b}$ so that $(BA)^2 = iX$. The controlled-U gate on
the right corrects for this unwanted factor of i. Here U is the 1-Qbit
unitary $e^{-i(\pi/2)\mathbf{n}}$. Since any controlled-U gate can be constructed with
two cNOT gates and 1-Qbit unitaries, this adds two more cNOT gates
to the construction, making a total of six.


Thus (2.50) produces a doubly-controlled-NOT gate except for an
extra factor of i accompanying the NOT. We can correct for this by
applying an additional $C^U_{21}$ gate, where U is the 1-Qbit unitary
$e^{-i(\pi/2)\mathbf{n}}$, with $\mathbf{n}$ the number operator. This controlled-U gate acts
as the identity on the computational basis unless the states of Qbits 2
and 1 are both $|1\rangle$, in which case it multiplies the state by
$e^{-i\pi/2} = -i$, thereby getting rid of the unwanted factor
of i. Since we have just established that any controlled-U gate can be
constructed with two cNOT gates and 1-Qbit unitaries, correcting the
phase adds two more cNOT gates to the construction, making a total
of six.
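
The six-cNOT construction can be verified the same way. The sketch below assumes the ordering $C^B_{10}\,C^A_{20}\,C^B_{10}\,C^A_{20}$ for (2.50) and the particular choice B = Z, A = (Z − Y)/√2, which is one pair of directions satisfying the stated conditions; basis states are labelled by the integer $4x_2 + 2x_1 + x_0$, and the helper names are illustrative.

```python
from functools import reduce


def mm(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def prod(*ms):
    return reduce(mm, ms)


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def controlled(u, control, target, n=3):
    """Matrix applying the 2x2 gate u to Qbit `target` when Qbit `control` is 1."""
    dim = 2 ** n
    m = [[0] * dim for _ in range(dim)]
    for col in range(dim):
        if (col >> control) & 1:
            t = (col >> target) & 1
            for new_t in (0, 1):
                row = (col & ~(1 << target)) | (new_t << target)
                m[row][col] = u[new_t][t]
        else:
            m[col][col] = 1
    return m


S = 2 ** -0.5
A = [[S, 1j * S], [-1j * S, -S]]   # (Z - Y)/sqrt(2), assumed choice of a
B = [[1, 0], [0, -1]]              # Z, assumed choice of b
I_X = [[0, 1j], [1j, 0]]           # i times X
PHASE = [[1, 0], [0, -1j]]         # exp(-i (pi/2) n) on the number-operator basis

ba = prod(B, A)
core = prod(controlled(B, 1, 0), controlled(A, 2, 0),
            controlled(B, 1, 0), controlled(A, 2, 0))   # equation (2.50)
circuit = prod(controlled(PHASE, 2, 1), core)

toffoli = [[1 if row == ((col ^ 1) if (col & 0b110) == 0b110 else col) else 0
            for col in range(8)] for row in range(8)]

print(close(prod(ba, ba), I_X), close(circuit, toffoli))
```

The four controlled-A and controlled-B gates cost one cNOT each, and the controlled-phase correction costs two more, giving six in all.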

Alternatively, one can view this as a way to construct a Toffoli gate
from four cNOT gates and a single controlled-phase gate of precisely
the kind that plays a central role in the quantum Fourier transform
described in Chapter 3. If quantum computation ever becomes a working
technology, it might well be easier to construct controlled-phase gates
as fundamental gates in their own right - pieces of 2-Qbit hardware as
basic as cNOT gates.

As this and subsequent examples reveal, the cNOT gate is of fundamental
importance in quantum computation. Appendix H gives some
examples of how such gates might actually be realized. That appendix is 
addressed primarily to physicists, but readers with other backgrounds 
might find it an interesting illustration of the rather different questions 
that arise when one starts thinking about how actually to produce some 
of the basic quantum-computational hardware. 



Chapter 3 

Breaking RSA encryption 


3.1 Period finding, factoring, and cryptography 

Simon’s problem (Section 2.5) starts with a subroutine that calculates
a function f(x), which satisfies f(x) = f(y) for distinct x and y if and
only if y = x ⊕ a, where ⊕ denotes the bitwise modulo-2 sum of the
n-bit integers a and x. The number of times a classical computer must
invoke the subroutine to determine a grows exponentially with n, but
with a quantum computer it grows only linearly.

This is a rather artificial example, of interest primarily because it 
gives a simple demonstration of the remarkable computational power 
a quantum computer can possess. It amounts to finding the unknown 
period a of a function on n-bit integers that is “periodic” under bitwise
modulo-2 addition. A more difficult, but much more natural problem 
is to find the period r of a function f on the integers that is periodic 
under ordinary addition, satisfying f(x) = f(y) for distinct x and y 
if and only if x and y differ by an integral multiple of r. Finding the 
period of such a periodic function turns out to be the key to factoring 
products of large prime numbers, a mathematically natural problem 
with quite practical applications. 

One might think that finding the period of such a periodic function 
ought to be easy, but that is only because when one thinks of periodic 
functions one tends to picture slowly varying continuous functions 
(like the sine function) whose values at a small sample of points within 
a period can give powerful clues about what that period might be. But 
the kind of periodic function to keep in mind here is a function on the 
integers whose values within a period r are virtually random from one 
integer to the next, and therefore give no hint of the value of r . 

The best known classical algorithms for finding the period r of such a
function take a time that grows faster than any power of the number n of
bits of r (exponentially with $n^{1/3}$). But in 1994 Peter Shor discovered
that one can exploit the power of a quantum computer to learn the
period r, in a time that scales only a little faster than $n^3$.

Because the ability to find periods efficiently, combined with some
number-theoretic tricks, enables one to factor efficiently the product
of two large prime numbers, Shor’s discovery of super-efficient
quantum period finding is of considerable practical interest. The very
great computational effort required by all known classical factorization
techniques underlies the security of the widely used RSA¹ method of
encryption. Any computer that can efficiently find periods would be
an enormous threat to the security of both military and commercial
communications. This is why research into the feasibility of quantum
computers is a matter of considerable interest in the worlds of war and
business.

Although the elementary number-theoretic tricks that underlie the
RSA method of encryption have nothing directly to do with how a
quantum computer finds periods, they motivate the problem that Shor’s
quantum-computational algorithm so effectively solves. Furthermore,
examining the number-theoretic basis of RSA encryption reveals that
Shor’s period-finding algorithm can be used to defeat it directly,
without any detour into factoring. We therefore defer the number-theoretic
connection between period finding and factoring to Section 3.10. If
you are interested only in applying Shor’s period-finding algorithm
to decoding RSA encryption, Section 3.10 can be skipped. If you are
not interested in the application of period finding to commerce and
espionage, you can also skip the number theory in Sections 3.2 and 3.3
and go directly to the quantum-computational part of the problem -
super-efficient period finding - in Section 3.4.


3.2 Number-theoretic preliminaries 

The basic algebraic entities behind RSA encryption are finite groups,
where the group operation is multiplication modulo some fixed integer
N. In modulo-N arithmetic all integers that differ by multiples of N
are identified, so there are only N distinct quantities, which can be
represented by 0, 1, ..., N − 1. For example 5 × 6 ≡ 2 (mod 7) since
5 × 6 = 30 = 4 × 7 + 2. One writes ≡ (mod N) to emphasize that the
equality is only up to a multiple of N, reserving = for strict equality.
One can develop the results that follow using arithmetic rather than
group theory, but the group-theoretic approach is simpler and uses
properties of groups so elementary that they can be derived from the
basic definitions in hardly more than a page. This is done in Appendix
I, which readers unacquainted with elementary group theory should
now read.

Let $G_N$ be the set of all positive integers less than N (including 1)
that have no factors in common with N. Since factoring into primes is
unique, the product of two numbers in $G_N$ (either the ordinary or the
modulo-N product) also has no factors in common with N, so $G_N$ is
closed under multiplication modulo N. If a, b, and c are in $G_N$ with
ab ≡ ac (mod N), then a(b − c) is a multiple of N, and since a has
no factors in common with N, it must be that b − c is a multiple of N,
so b ≡ c (mod N). Thus the operation of multiplication modulo N by
a fixed member a of $G_N$ takes distinct members of $G_N$ into distinct
members, so the operation simply permutes the members of the finite
set $G_N$. Since 1 is a member of $G_N$ there must be some d in $G_N$
satisfying ad ≡ 1 - i.e. a must have a multiplicative inverse in $G_N$.
Thus $G_N$ satisfies the conditions, listed in Appendix I, for it to be a
group under modulo-N multiplication.

1 Named after the people who invented it in 1977, Ronald Rivest, Adi Shamir,
and Leonard Adleman. RSA encryption was independently invented by
Clifford Cocks four years earlier, but his discovery was classified top secret
by British Intelligence and he was not allowed to reveal his priority until
1997. For this and other fascinating tales about cryptography, see Simon
Singh, The Code Book, New York, Doubleday (1999).

Every member a of a finite group G is characterized by its order k,
the smallest positive integer for which (in the case of $G_N$)

$$a^k \equiv 1 \pmod{N}. \qquad (3.1)$$

As shown in Appendix I, the order of every member of G is a divisor of
the number of members of G (the order of G). If p is a prime number,
then the group $G_p$ contains p − 1 numbers, since no positive integer
less than p has factors in common with p. Since p − 1 is then a multiple
of the order k of any a in $G_p$, it follows from (3.1) that any positive
integer a less than p satisfies

$$a^{p-1} \equiv 1 \pmod{p}. \qquad (3.2)$$

This relation, known as Fermat’s little theorem, extends to arbitrary
integers a not divisible by p, since any such a is of the form a =
mp + a′ with m an integer and a′ less than p.

RSA encryption exploits an extension of Fermat’s little theorem to
a case characterized by two distinct primes, p and q. If an integer a is
divisible neither by p nor by q, then no power of a is divisible by either
p or q. Since, in particular, $a^{q-1}$ is not divisible by p, we conclude
from (3.2) that

$$[a^{q-1}]^{p-1} \equiv 1 \pmod{p}. \qquad (3.3)$$

For the same reason

$$[a^{p-1}]^{q-1} \equiv 1 \pmod{q}. \qquad (3.4)$$

The relations (3.3) and (3.4) state that $a^{(q-1)(p-1)} - 1$ is a multiple both
of p and of q. Since p and q are distinct primes, it must therefore be a
multiple of pq, and therefore

$$a^{(q-1)(p-1)} \equiv 1 \pmod{pq}. \qquad (3.5)$$

(You are urged to check relations like (3.5) for yourself in special cases.
If, for example, a = 2, p = 3, and q = 5 then (3.5) requires $2^8 - 1$ to be
divisible by 15, and indeed, 255 = 17 × 15.)

As an alternative derivation of (3.5), note that since a is divisible
neither by p nor by q, it has no factors in common with pq and is
therefore in $G_{pq}$. The number of elements of $G_{pq}$ is
pq − 1 − (p − 1) − (q − 1) = (p − 1)(q − 1), since there are pq − 1 positive
integers less than pq, among which are p − 1 multiples of q and another
distinct q − 1 multiples of p. Equation (3.5) follows because the order
(p − 1)(q − 1) of $G_{pq}$ must be a multiple of the order of a.
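
Relations such as (3.2) and (3.5) are easy to check by machine for small primes. The following sketch does so by brute force; the helper names `units` and `order` are illustrative and the primes are arbitrary small choices.

```python
from math import gcd


def units(n):
    """Members of G_n: positive integers below n with no factor in common with n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def order(a, n):
    """Smallest positive k with a**k = 1 (mod n), for a in G_n."""
    k, y = 1, a % n
    while y != 1:
        y = y * a % n
        k += 1
    return k


p, q = 3, 5
group = units(p * q)

# G_pq has (p - 1)(q - 1) members ...
assert len(group) == (p - 1) * (q - 1)
# ... the order of every member divides the order of the group ...
assert all((p - 1) * (q - 1) % order(a, p * q) == 0 for a in group)
# ... and therefore (3.5) holds for every a in G_pq.
assert all(pow(a, (p - 1) * (q - 1), p * q) == 1 for a in group)
# Fermat's little theorem (3.2) for a slightly larger prime.
assert all(pow(a, 13 - 1, 13) == 1 for a in range(1, 13))

print(group, [order(a, p * q) for a in group])
```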

We get the version of (3.5) that is the basis for RSA encryption by
taking any integral power s of (3.5) and multiplying both sides by a:

$$a^{1+s(q-1)(p-1)} \equiv a \pmod{pq}. \qquad (3.6)$$

(The relation (3.6) holds even for integers a that are divisible by p or
q. It holds trivially when a is a multiple of pq. And if a is divisible by
just one of p and q, let a = kq. Since a is not divisible by p neither
is any power of a, and therefore Fermat’s little theorem tells us that
$[a^{s(q-1)}]^{p-1} = 1 + np$ for some integer n. On multiplying both sides
by a we have $a^{1+s(q-1)(p-1)} = a + nap = a + nkqp$, so (3.6) continues
to hold.)

Note finally that if c is an integer having no factor in common with
(p − 1)(q − 1) then c is in $G_{(p-1)(q-1)}$ and therefore has an inverse in
$G_{(p-1)(q-1)}$, i.e. there is a d in $G_{(p-1)(q-1)}$ satisfying

$$cd \equiv 1 \pmod{(p-1)(q-1)}. \qquad (3.7)$$

So for some integer s,

$$cd = 1 + s(p-1)(q-1). \qquad (3.8)$$

In view of (3.8) and (3.6), any integer a must satisfy

$$a^{cd} \equiv a \pmod{pq}. \qquad (3.9)$$

So if

$$b \equiv a^c \pmod{pq}, \qquad (3.10)$$

then

$$b^d \equiv a \pmod{pq}. \qquad (3.11)$$

The elementary arithmetical facts summarized in this single paragraph
constitute the entire basis for RSA encryption.
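
A short sketch makes (3.7)-(3.11) concrete. It finds d from c with the standard extended Euclidean algorithm (offered here as one way to compute the inverse, not necessarily in the form given in Appendix J) and then checks (3.9) for every a below pq, including those divisible by p or q. The small primes and the function names are illustrative.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) = u*a + v*b."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_u, u = u, old_u - k * u
        old_v, v = v, old_v - k * v
    return old_r, old_u, old_v


def decoding_exponent(c, p, q):
    """The d of (3.7): the inverse of c modulo (p - 1)(q - 1)."""
    m = (p - 1) * (q - 1)
    g, u, _ = extended_gcd(c, m)
    if g != 1:
        raise ValueError("c shares a factor with (p - 1)(q - 1)")
    return u % m


p, q, c = 5, 11, 7
d = decoding_exponent(c, p, q)
assert c * d % ((p - 1) * (q - 1)) == 1               # (3.7)
for a in range(p * q):                                # every a, even multiples of p or q
    b = pow(a, c, p * q)                              # (3.10)
    assert pow(b, d, p * q) == a                      # (3.11)
print(d)
```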


3.3 RSA encryption 

Bob wants to receive a message from Alice encoded so that he alone can
read it. To do this he picks two large (say 200-digit) prime numbers p
and q. He gives Alice, through a public channel, their product N = pq
and a large encoding number c that he has picked to have no factors
in common with² (p − 1)(q − 1). He does not, however, reveal the
separate values of p and q and, given the practical impossibility of
factoring a 400-digit number with currently available computers, he
is quite confident that neither Alice nor any eavesdropper Eve will be
able to calculate p and q knowing only their product N. Bob, however,
because he does know p and q, and therefore (p − 1)(q − 1), can find
the multiplicative inverse d of c mod (p − 1)(q − 1), which satisfies
(3.7).³ He keeps d strictly to himself for use in decoding.

Alice encodes a message by representing it as a string of fewer than
400 digits using, for example, some version of ASCII coding. If her
message requires more than 400 digits she chops it up into smaller
pieces. She interprets each such string as a number a less than N.
Using the coding number c and the value of N = pq she received
from Bob, she then calculates $b \equiv a^c$ (mod pq), and sends it on to Bob
through a public channel. With c typically a 200-digit number, you
might think that this would itself be a huge computational task, but it
is not, as noted in Section 3.8. When he receives b, Bob exploits his
private knowledge of d to calculate $b^d$ (mod pq), which (3.11) assures
him is Alice’s original message a.
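
The exchange between Bob and Alice can be acted out with toy numbers. In the sketch below the primes 61 and 53 and the coding number 17 are arbitrary small choices, each message byte is sent as a separate number a < N purely for illustration, and the function names are hypothetical. It relies on Python's built-in three-argument `pow`, which performs modular exponentiation and, from Python 3.8, accepts the exponent −1 to compute a modular inverse.

```python
def bob_make_keys(p, q, c):
    """Bob's side: publish (N, c), keep d with cd = 1 (mod (p - 1)(q - 1))."""
    d = pow(c, -1, (p - 1) * (q - 1))   # needs Python 3.8 or later
    return (p * q, c), d


def alice_encode(message, public_key):
    """Alice's side: b = a**c (mod N) for each piece a of the message."""
    n, c = public_key
    return [pow(a, c, n) for a in message]


def bob_decode(blocks, n, d):
    """Bob's side again: a = b**d (mod N), by (3.11)."""
    return bytes(pow(b, d, n) for b in blocks)


public_key, d = bob_make_keys(61, 53, 17)
encoded = alice_encode(b"period", public_key)
assert bob_decode(encoded, public_key[0], d) == b"period"
print(public_key, d, encoded)
```

Sending one byte per block would be hopelessly insecure in practice; it is done here only so that every number stays small enough to inspect.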

Were the eavesdropper Eve able to find the factors p and q of N, she
could calculate (p − 1)(q − 1) and find the decoding integer d from the
publicly available coding integer c, the same way Bob did. But factoring
a number as large as N is far beyond her classical computational powers.
Efficient period finding is of interest in this cryptographic setting not
only because it leads directly to efficient factoring (as described in
Section 3.10), but also because it can lead Eve directly to an alternative
way to decode Alice’s message b without her knowing or having to
compute the factors p and q of N. Here is how it works:

Eve uses her efficient period-finding machine to calculate the order
r of Alice’s publicly available encoded message $b \equiv a^c$ in⁴ $G_{pq}$. Now
the order r of Alice’s encoded message $b \equiv a^c$ in $G_{pq}$ is the same
as the order of a. This is because the subgroup of $G_{pq}$ generated by
a contains $a^c = b$, and hence it contains the subgroup generated by
b; but the subgroup generated by b contains $b^d = a$, and hence the
subgroup generated by a. Since each subgroup contains the other, they
must be identical. Since the order of a or b is the number of elements
in the subgroup it generates, their orders are the same. So if Eve can
find the order r of Alice’s code message b, then she has also learned
the order of Alice’s original text a.

2 As shown in Appendix J, the probability that two large random numbers
have no common factor is greater than ½, so such c are easily found. Whether
two numbers do have any factors in common (and what their greatest
common factor is) can be determined by a simple algorithm known to Euclid
and easily executed by Bob on a classical computer. The Euclidean algorithm
is described in Appendix J.

3 This can easily be done classically as a straightforward embellishment of the
Euclidean algorithm. See Appendix J.

4 I assume that Alice’s unencoded message a, and hence her coded message b,
is in $G_{pq}$ - i.e. that a is not a multiple of p or q. Since p and q are huge
prime numbers, the odds against a being such a multiple are astronomical.
But if Eve wants to be insanely careful she can find the greatest common
factor of b and N, using the Euclidean algorithm. In the grossly improbable
case that it turns out not to be 1, Eve will have factored N and can decode
Alice’s message the same way Bob does.

Since Bob has picked c to have no factors in common with
(p − 1)(q − 1), and since r divides the order (p − 1)(q − 1) of $G_{pq}$,
the coding integer c can have no factors in common with r. So c is
congruent modulo r to a member c′ of $G_r$, which has an inverse d′ in
$G_r$, and d′ is also a modulo-r inverse of c:

$$cd' \equiv 1 \pmod{r}. \qquad (3.12)$$

Therefore, given c (which Bob has publicly announced) and r (which
Eve can get with her period-finding program from Alice’s encoded
message b and the publicly announced value of N = pq), it is easy for
Eve to calculate d′ with a classical computer, using, modulo r, the same
extension of the Euclidean algorithm as Bob used to find d, modulo
(p − 1)(q − 1). It then follows that for some integer m

$$b^{d'} \equiv a^{cd'} = a^{1+mr} = a(a^r)^m \equiv a \pmod{pq}. \qquad (3.13)$$

Eve has thus used her ability to find periods to decode Alice’s encoded
message $b \equiv a^c$ to reveal Alice’s original message a.
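
Eve's procedure can be sketched in a few lines, with one loud caveat: the function `order` below finds r by brute-force multiplication, which takes a time exponential in the number of bits and merely stands in for the quantum period-finding routine. The toy values N = 61 × 53 and c = 17, and all function names, are illustrative assumptions.

```python
from math import gcd


def order(b, n):
    """Order r of b in G_n, found by brute force (classical stand-in for period finding)."""
    if gcd(b, n) != 1:
        raise ValueError("b is not in G_n")
    r, y = 1, b % n
    while y != 1:
        y = y * b % n
        r += 1
    return r


def eve_decode(b, c, n):
    """Recover a from b = a**c (mod n) using only the public b, c and n."""
    r = order(b, n)
    d_prime = pow(c, -1, r) if r > 1 else 0   # cd' = 1 (mod r), equation (3.12)
    return pow(b, d_prime, n)                 # equation (3.13)


n, c = 61 * 53, 17
a = 65
b = pow(a, c, n)               # what Alice sends
assert eve_decode(b, c, n) == a
print(b, order(b, n), eve_decode(b, c, n))
```

Eve never uses p, q, or Bob's d; the only secret she needs is the order r.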

This use of period finding to defeat RSA encryption is summarized
in Table 3.1.


3.4 Quantum period finding: preliminary remarks 

So we can crack the RSA code if we have a fast way to find the period
r of the known periodic function

$$f(x) = b^x \pmod{N}. \qquad (3.14)$$

This might appear to be a simple task, especially since periodic
functions of the special form (3.14) have the simplifying feature that
f(x + s) = f(x) only if s is a multiple of the period r. But $b^x$ (mod N)
is precisely the kind of function whose values within a period hop about
so irregularly as to offer no obvious clues about the period. One could
try evaluating f(x) for random x until one found two different values
of x for which f agreed. Those values would differ by a multiple of
the period, which would provide some important information about
the value of the period itself. But this is an inefficient way to proceed,
even classically.

Table 3.1. A summary of RSA encryption and how to break it with a fast
period-finding routine on a quantum computer. Bob has chosen the encoding
number c to have an inverse d modulo (p − 1)(q − 1), so c can have no
factors in common with (p − 1)(q − 1). Since Alice’s encoded message b is
in $G_{pq}$, its order r is a factor of the order (p − 1)(q − 1) of $G_{pq}$. So c
can have no factors in common with r, and therefore has an inverse d′
modulo r. Because b is a power of a and vice versa, each has the same
order r in $G_{pq}$. Therefore $b^{d'} \equiv a^{cd'} = a^{1+mr} \equiv a$ modulo N.

Bob knows                       Alice knows                     Public knows
------------------------------  ------------------------------  ------------------------------
p and q (primes);               a (her message);                b (encoded message);
c and d satisfying              only c (not d)                  only c (not d);
  cd ≡ 1 (mod (p − 1)(q − 1));    and only N = pq;              only N = pq.
b (encoded message).            b ≡ a^c (mod N)
                                  (encoded message).

Decoding:                                                       Quantum decoding:
a ≡ b^d (mod N).                                                Quantum computer finds r:
                                                                  b^r ≡ 1 (mod N);
                                                                classical computer finds d′:
                                                                  cd′ ≡ 1 (mod r);
                                                                a ≡ b^d′ (mod N).


Let $n_0$ be the number of bits in N = pq, so that $2^{n_0}$ is the smallest
power of 2 that exceeds N. If N is a 500-digit number - a typical size
for cryptographic applications - $n_0$ will be around 1700. This also sets
the scale for the typical number of bits in the other relevant numbers
a, b, and their modulo-N period r. To have an appreciable probability
of finding r by random searching requires a number of evaluations of
f that is exponential in $n_0$ (just as in the classical approach to Simon’s
problem, described in Chapter 2). There are classical ways to improve
on random searching, using, for example, Fourier analysis, but no
classical approach is known that does not require a time that grows faster
than any power of $n_0$. With a quantum computer, however, quantum
parallelism gets us tantalizingly close (but, as in Simon’s problem, not
close enough) to solving the problem with a single application of $U_f$,
and enables us to solve it completely with probability arbitrarily close
to unity in a time that grows only as a low-order polynomial in $n_0$.

To deal with values of x and $f(x) = b^x$ (mod N) between 0 and N,
both the input and output registers must contain at least $n_0$ Qbits. For
reasons that will emerge in Section 3.7, however, to find the period r
efficiently the input register must actually have $n = 2n_0$ Qbits.
Doubling the number of Qbits in the input register ensures that the range of
values of x for which f(x) is calculated contains at least N full periods
of f. This redundancy turns out to be essential for a successful
determination of the period by Shor’s method. (We shall see in Section 3.7
that if p and q both happen to be primes of the form $2^j + 1$ then - and
only then - the method works without doubling the size of the input
register. Thus $N = 15 = (2 + 1)(2^2 + 1)$ does not provide a realistic
test case for laboratory attempts to demonstrate Shor’s algorithm for
small p and q with real Qbits.)
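
The register sizes can be tabulated directly. The sketch below uses Python's `int.bit_length`, which for a positive integer N returns the smallest $n_0$ with $2^{n_0} > N$; the function name `register_sizes` is illustrative.

```python
def register_sizes(N):
    """Return (n0, n): output-register and doubled input-register sizes in Qbits."""
    n0 = N.bit_length()          # 2**n0 is the smallest power of 2 exceeding N
    return n0, 2 * n0


# The smallest and largest 500-digit numbers bracket the n0 quoted in the text.
print(register_sizes(10 ** 499), register_sizes(10 ** 500 - 1))

# For N = 15: any period r < N fits at least N times into the 2**n input values.
n0, n = register_sizes(15)
assert all(2 ** n // r >= 15 for r in range(1, 15))
```

For 500-digit N this gives $n_0$ between 1658 and 1661, consistent with the rough figure of 1700.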

We begin the quantum period-finding algorithm by using our
quantum computer in the familiar way to construct the state

$$\frac{1}{2^{n/2}}\sum_{x=0}^{2^n-1}|x\rangle_n|f(x)\rangle_{n_0} \qquad (3.15)$$

with a single application of $U_f$. In Section 3.8 we take a closer look at
how this might efficiently be done in the case of interest, $f(x) = b^x$
(mod N). Once the state of the registers has become (3.15), we can
measure the $n_0$-Qbit output register.⁵ If the measurement yields the
value $f_0$ then the generalized Born rule tells us that the state of the
n-Qbit input register can be taken to be

$$|\Psi\rangle_n = \frac{1}{\sqrt{m}}\sum_{k=0}^{m-1}|x_0 + kr\rangle_n. \qquad (3.16)$$

Here $x_0$ is the smallest value of x ($0 \le x_0 < r$) for which $f(x_0) = f_0$,
and m is the smallest integer for which $mr + x_0 \ge 2^n$, so

$$m = \left[\frac{2^n}{r}\right] \quad\text{or}\quad m = \left[\frac{2^n}{r}\right] + 1, \qquad (3.17)$$

depending on the value of $x_0$ (where [x] is the integral part of x - the
largest integer less than or equal to x). As in the examples of Chapter 2,
if we could produce a small number of identical copies of the state (3.16)
the job would be done, for a measurement in the computational basis
would yield a random one of the values $x_0 + kr$, and the difference
between the results of pairs of measurements on such identical copies
would give us a collection of random multiples of r from which r itself
could straightforwardly be extracted. But this possibility is ruled out by
the no-cloning theorem. All we can extract is a single value of $x_0 + kr$
for unknown random $x_0$, which is useless for determining r. And, of
course, if we ran the whole algorithm again, we would end up with a state
of the form (3.16) for another random value of $x_0$, which would permit
no useful comparison with what we had learned from the first run.

But, as with Simon’s problem, we can do something more clever
to the state (3.16) before making our final measurement. The problem
is the displacement by the unknown random $x_0$, which prevents any
information about r from being extracted in a single measurement. We
need a unitary transformation that transforms the $x_0$ dependence into
a harmless overall phase factor. This is accomplished with the quantum
Fourier transform.

5 It is not, in fact, necessary to measure the output register. One can continue
to work with the full state (3.15) in which one breaks down the sum on x into
a sum over all the different values of f and a sum over all the values of x
associated with each value of f. The only purpose of the measurement is to
clarify the analysis by eliminating a lot of uninteresting additional structure,
coming from the sum on the values of f, that plays no role beyond making
many of the subsequent expressions somewhat lengthier.


3.5 The quantum Fourier transform 

The heart of Shor’s algorithm is a superfast quantum Fourier transform, 
which can be carried out by a spectacularly efficient quantum 
circuit built entirely out of 1-Qbit and 2-Qbit gates. The n-Qbit quantum 
Fourier transform is defined to be that unitary transformation $U_{FT}$ 
whose action on the computational basis is given by 

$$U_{FT}|x\rangle_n = \frac{1}{2^{n/2}} \sum_{y=0}^{2^n-1} e^{2\pi i xy/2^n} |y\rangle_n. \tag{3.18}$$

The product xy is here ordinary multiplication.⁶ One easily verifies that 
$U_{FT}|x\rangle$ is normalized to unity and that $U_{FT}|x\rangle$ is orthogonal to 
$U_{FT}|x'\rangle$ unless $x = x'$, so $U_{FT}$ is unitary. Unitarity also emerges 
directly from the analysis that follows, which explicitly constructs 
$U_{FT}$ out of 1- and 2-Qbit unitary gates. The unitary $U_{FT}$ is useful 
because, as one also easily verifies, applied to a superposition of states 
$|x\rangle$ with complex amplitudes $\gamma(x)$, it produces another superposition 
with amplitudes that are related to $\gamma(x)$ by the appropriate discrete 
Fourier transform: 

$$U_{FT}\left(\sum_{x=0}^{2^n-1} \gamma(x)|x\rangle\right) = \sum_{x=0}^{2^n-1} \tilde\gamma(x)|x\rangle, \tag{3.19}$$

where 

$$\tilde\gamma(x) = \frac{1}{2^{n/2}} \sum_{y=0}^{2^n-1} e^{2\pi i xy/2^n}\,\gamma(y). \tag{3.20}$$
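The definition (3.18) and the relations (3.19)-(3.20) can be checked numerically for a small register. The sketch below uses only the Python standard library; the function names (`qft_matrix`, `apply`, `is_unitary`) and the test amplitudes are illustrative choices, not notation from the text.

```python
import cmath
import math

def qft_matrix(n):
    """Matrix elements <y|U_FT|x> = exp(2*pi*i*x*y/2^n) / 2^(n/2), from (3.18)."""
    dim = 2 ** n
    norm = 1 / math.sqrt(dim)
    return [[norm * cmath.exp(2j * math.pi * x * y / dim) for x in range(dim)]
            for y in range(dim)]

def apply(matrix, amplitudes):
    """Multiply a matrix into a column of amplitudes."""
    return [sum(row[x] * amplitudes[x] for x in range(len(amplitudes)))
            for row in matrix]

def is_unitary(matrix, tol=1e-9):
    """Check that the columns U|x> are orthonormal."""
    dim = len(matrix)
    for a in range(dim):
        for b in range(dim):
            inner = sum(matrix[y][a].conjugate() * matrix[y][b] for y in range(dim))
            if abs(inner - (1 if a == b else 0)) > tol:
                return False
    return True

n = 3
U = qft_matrix(n)
gamma = [complex(k + 1, -k) for k in range(2 ** n)]   # arbitrary (unnormalized) amplitudes
gamma_tilde = apply(U, gamma)                          # the transformed amplitudes of (3.20)
```

Building the full matrix takes a time of order $(2^n)^2$, so this is only a check of the definition for small n, not a model of how the quantum circuit works.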

The celebrated classical fast Fourier transform is an algorithm 
requiring a time that grows with the number of bits as $n2^n$ (rather than 
$(2^n)^2$ as the obvious direct approach would require) to evaluate 
$\tilde\gamma$. But there is a quantum algorithm for executing the unitary 
transformation $U_{FT}$ exponentially faster than fast, in a time that 
grows only as $n^2$. The catch, as usual, is that one does not end up 
knowing the complete set of Fourier coefficients, as one does after 
applying the classical fast Fourier transform. One just has n Qbits 
described by the state given by the right side of (3.19), and as we have 
repeatedly noted, having a collection of Qbits in a given state does not 
enable one to learn what that state actually is. There is no way to 
extract all the Fourier coefficients $\tilde\gamma$, given an n-Qbit register 
in the state (3.19). But if $\gamma$ is a periodic function with a period 
that is no bigger than $2^{n/2}$, then a register in the state (3.19) can 
give powerful clues about the precise value of the period r, even though 
r can be hundreds of digits long. 


6 A warning to physicists (which others can ignore). This looks deceptively 
like a (discretized) transformation from a position to a momentum 
representation, and one’s first reaction might be that it is (perhaps 
disappointingly) familiar. But it has, in fact, an entirely different character. 
The number x is the integer represented by the state $|x\rangle$; it is not the 
position of anything. Changing x to x + 1 induces an arithmetically natural 
but physically quite unnatural transformation on the computational-basis 
states, determined by the laws of binary addition, including carrying. It bears 
no resemblance to anything that could be associated with a spatial translation 
in the physical space of Qbits. So your eyes should not glaze over, and you 
should regard $U_{FT}$ as a new and unfamiliar physical transformation of Qbits. 

Notice the resemblance of the quantum Fourier transform (3.18) 
to the n-fold Hadamard transformation. Since $-1 = e^{i\pi}$, the n-fold 
Hadamard (2.30) assumes the form 

$$H^{\otimes n}|x\rangle_n = \frac{1}{2^{n/2}} \sum_{y=0}^{2^n-1} e^{i\pi x\cdot y}|y\rangle_n. \tag{3.21}$$

Aside from the different powers of 2 appearing in the quantum Fourier 
transform (3.18) - so the factors of modulus 1 in the superposition are 
not just 1 and -1 - the only other difference between the two transforms 
is that xy is ordinary multiplication in the quantum Fourier transform, 
whereas $x \cdot y$ is the bitwise inner product in the n-fold Hadamard. 
Because the arithmetic product xy is a more elaborate function of 
x and y than $x \cdot y$, the quantum Fourier transformation cannot be 
built entirely out of 1-Qbit unitary gates as the n-fold Hadamard is. 
But, remarkably, it can be constructed entirely out of 1- and 2-Qbit 
gates. Even more remarkably, when the procedure is used for period 
finding all of the 2-Qbit gates can be replaced by 1-Qbit measurement 
gates followed by additional 1-Qbit unitary gates whose application is 
contingent on the measurement outcomes. 

To construct a circuit to execute the quantum Fourier transform 
$U_{FT}$, it is convenient to introduce an n-Qbit unitary operator 
$\mathcal{Z}$, diagonal in the computational basis: 

$$\mathcal{Z}|y\rangle_n = e^{2\pi i y/2^n}|y\rangle_n. \tag{3.22}$$

This can be viewed as a generalization to n Qbits of the 1-Qbit operator 
Z, to which it reduces when n = 1. Using the familiar relation 

$$H^{\otimes n}|0\rangle_n = \frac{1}{2^{n/2}} \sum_{y=0}^{2^n-1} |y\rangle_n, \tag{3.23}$$

we can reexpress the definition (3.18) as 

$$U_{FT}|x\rangle_n = \mathcal{Z}^x H^{\otimes n}|0\rangle_n. \tag{3.24}$$

This gives $U_{FT}|x\rangle_n$ as an x-dependent operator acting on the state $|0\rangle_n$. 

We next reexpress the right side of (3.24) as an x-independent linear 
operator acting on the state $|x\rangle_n$. Since the computational-basis states 
$|x\rangle_n$ are a basis, this will give us an alternative expression for $U_{FT}$ 
itself. The construction of this alternative form for (3.24) is made more 
transparent by specializing to the case of four Qbits. The structure 
that emerges in the case n = 4 has an obvious extension to general n. 
Dealing with the case of general n from the start only obscures things. 

When n = 4 we want to find an appropriate form for 

$$U_{FT}|x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle = \mathcal{Z}^x H_3H_2H_1H_0|0\rangle|0\rangle|0\rangle|0\rangle. \tag{3.25}$$

As usual, we number the Qbits by the power of 2 with which they 
are associated, with the least significant on the right, so that, reading 
from right to left, the Qbits are labeled 0, 1, 2, and 3; $H_i$ acts on 
the Qbit labeled i (and as the identity on all other Qbits). If 
$|y\rangle_4 = |y_3\rangle|y_2\rangle|y_1\rangle|y_0\rangle$ in the definition (3.22) of $\mathcal{Z}$, so that 
$y = 8y_3 + 4y_2 + 2y_1 + y_0$, then the operator $\mathcal{Z}$ can be constructed 
out of single-Qbit number operators: 

$$\mathcal{Z} = \exp\left(\frac{i\pi}{8}(8\mathbf{n}_3 + 4\mathbf{n}_2 + 2\mathbf{n}_1 + \mathbf{n}_0)\right). \tag{3.26}$$

The operator $\mathcal{Z}^x$ appearing in (3.25) then becomes 

$$\mathcal{Z}^x = \exp\left(\frac{i\pi}{8}(8x_3 + 4x_2 + 2x_1 + x_0)(8\mathbf{n}_3 + 4\mathbf{n}_2 + 2\mathbf{n}_1 + \mathbf{n}_0)\right). \tag{3.27}$$

Because the 1-Qbit operator $\exp(2\pi i\mathbf{n})$ acts as the identity on either 
of the 1-Qbit states $|0\rangle$ and $|1\rangle$, and because any 1-Qbit state is a 
superposition of these two, $\mathbf{n}$ obeys the operator identity 

$$\exp(2\pi i\mathbf{n}) = 1. \tag{3.28}$$

Therefore, in multiplying out the two terms 

$$(8x_3 + 4x_2 + 2x_1 + x_0)(8\mathbf{n}_3 + 4\mathbf{n}_2 + 2\mathbf{n}_1 + \mathbf{n}_0) \tag{3.29}$$

appearing in the exponential (3.27), we can drop all products $x_i\mathbf{n}_j$ 
whose coefficients are a power of 2 greater than 8 (after multiplication by 
$i\pi/8$ each such term is an integral multiple of $2\pi i\mathbf{n}_j$), getting 

$$\mathcal{Z}^x = \exp\big[i\pi\big(x_0\mathbf{n}_3 + (x_1 + \tfrac12 x_0)\mathbf{n}_2 + (x_2 + \tfrac12 x_1 + \tfrac14 x_0)\mathbf{n}_1 + (x_3 + \tfrac12 x_2 + \tfrac14 x_1 + \tfrac18 x_0)\mathbf{n}_0\big)\big]. \tag{3.30}$$
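The step from (3.27) to (3.30) can be confirmed by brute force, since both operators are diagonal in the computational basis: it is enough to compare their eigenvalues on every 4-Qbit state $|y\rangle$ for every x. The following is a minimal sketch; the helper names `bits`, `full_phase`, and `reduced_phase` are hypothetical, introduced only for this check.

```python
import cmath
import math
from fractions import Fraction

def bits(v):
    """Return [v3, v2, v1, v0] for a 4-bit integer v = 8*v3 + 4*v2 + 2*v1 + v0."""
    return [(v >> k) & 1 for k in (3, 2, 1, 0)]

def full_phase(x, y):
    """Eigenvalue of Z^x on |y>, from (3.27): exp(i*pi*x*y/8)."""
    return cmath.exp(1j * math.pi * x * y / 8)

def reduced_phase(x, y):
    """Eigenvalue of the reduced form (3.30) on |y>."""
    x3, x2, x1, x0 = bits(x)
    n3, n2, n1, n0 = bits(y)           # eigenvalues of the number operators
    half, quarter, eighth = Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)
    exponent = (x0 * n3
                + (x1 + half * x0) * n2
                + (x2 + half * x1 + quarter * x0) * n1
                + (x3 + half * x2 + quarter * x1 + eighth * x0) * n0)
    return cmath.exp(1j * math.pi * float(exponent))

# Largest discrepancy between the two forms over all 16 x 16 pairs (x, y).
max_error = max(abs(full_phase(x, y) - reduced_phase(x, y))
                for x in range(16) for y in range(16))
```

The two exponents differ only by even integers times $i\pi$, so `max_error` vanishes up to floating-point rounding.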

Note next that the number and Hadamard operators for any single 
Qbit obey the relation 

$$\exp(i\pi x\mathbf{n})H|0\rangle = H|x\rangle. \tag{3.31}$$

This is trivial when x = 0, and when x = 1 it reduces to the correct 
statement 

$$(-1)^{\mathbf{n}}\tfrac{1}{\sqrt2}(|0\rangle + |1\rangle) = \tfrac{1}{\sqrt2}(|0\rangle - |1\rangle). \tag{3.32}$$

(Alternatively, note that $\exp(i\pi\mathbf{n}) = Z$ and ZH = HX.) The effect on 
(3.25) of the four terms in (3.30) that do not contain factors of 1/2, 1/4, 
or 1/8 is to produce the generalization of (3.31) to several Qbits: 

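$$\begin{aligned}
&\exp[i\pi(x_0\mathbf{n}_3 + x_1\mathbf{n}_2 + x_2\mathbf{n}_1 + x_3\mathbf{n}_0)]\,H_3H_2H_1H_0|0\rangle|0\rangle|0\rangle|0\rangle \\
&\quad= [\exp(i\pi x_0\mathbf{n}_3)H_3][\exp(i\pi x_1\mathbf{n}_2)H_2][\exp(i\pi x_2\mathbf{n}_1)H_1][\exp(i\pi x_3\mathbf{n}_0)H_0]\,|0\rangle|0\rangle|0\rangle|0\rangle \\
&\quad= H_3H_2H_1H_0|x_0\rangle|x_1\rangle|x_2\rangle|x_3\rangle.
\end{aligned} \tag{3.33}$$

We have used the fact that number operators associated with different 
Qbits commute with one another. Note also that because the number 
operator $\mathbf{n}_i$ is multiplied by $x_{3-i}$ on the left side of (3.33), the state of 
the Qbit labeled i on the right is $|x_{3-i}\rangle$. 

The remaining six terms in (3.30) (containing fractional coefficients) 
further convert (3.25) to the form 

$$\begin{aligned}
U_{FT}|x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle = \exp\big[i\pi\big(&\tfrac12 x_0\mathbf{n}_2 + (\tfrac12 x_1 + \tfrac14 x_0)\mathbf{n}_1 \\
&+ (\tfrac12 x_2 + \tfrac14 x_1 + \tfrac18 x_0)\mathbf{n}_0\big)\big]\,H_3H_2H_1H_0|x_0\rangle|x_1\rangle|x_2\rangle|x_3\rangle.
\end{aligned} \tag{3.34}$$

Since the Hadamard transformation $H_i$ commutes with the number 
operator $\mathbf{n}_j$ when $i \ne j$, we can regroup the terms in (3.34) so that each 
number operator $\mathbf{n}_i$ appears immediately to the left of its corresponding 
Hadamard operator $H_i$: 

$$\begin{aligned}
U_{FT}|x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle = {}& H_3\exp[i\pi\mathbf{n}_2\tfrac12 x_0]H_2\exp[i\pi\mathbf{n}_1(\tfrac12 x_1 + \tfrac14 x_0)]H_1 \\
&\times \exp[i\pi\mathbf{n}_0(\tfrac12 x_2 + \tfrac14 x_1 + \tfrac18 x_0)]H_0 \\
&\times |x_0\rangle|x_1\rangle|x_2\rangle|x_3\rangle.
\end{aligned} \tag{3.35}$$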

The state $|x_0\rangle|x_1\rangle|x_2\rangle|x_3\rangle$ is an eigenstate of the number operators 
$\mathbf{n}_3, \mathbf{n}_2, \mathbf{n}_1, \mathbf{n}_0$ with respective eigenvalues $x_0, x_1, x_2, x_3$. If we did not 
have to worry about Hadamard operators interposing themselves between 
number operators and their eigenstates, we could replace each 
$x_i$ in (3.35) by the number operator $\mathbf{n}_{3-i}$ of which it is the eigenvalue 
to get 

$$\begin{aligned}
U_{FT}|x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle = {}& H_3\exp[i\pi\tfrac12\mathbf{n}_2\mathbf{n}_3]H_2\exp[i\pi\mathbf{n}_1(\tfrac12\mathbf{n}_2 + \tfrac14\mathbf{n}_3)] \\
&\times H_1\exp[i\pi\mathbf{n}_0(\tfrac12\mathbf{n}_1 + \tfrac14\mathbf{n}_2 + \tfrac18\mathbf{n}_3)]H_0 \\
&\times |x_0\rangle|x_1\rangle|x_2\rangle|x_3\rangle.
\end{aligned} \tag{3.36}$$

But as (3.36) makes clear, we do indeed not have to worry, because every 
$H_i$ appears safely to the left of every $\mathbf{n}_i$ that has replaced an $x_{3-i}$. 
If we define 2-Qbit unitary operators by 

$$V_{ij} = \exp(i\pi\mathbf{n}_i\mathbf{n}_j/2^{|i-j|}), \tag{3.37}$$

then (3.36) assumes the more readable form 

$$U_{FT}|x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle = H_3(V_{32}H_2)(V_{31}V_{21}H_1)(V_{30}V_{20}V_{10}H_0)|x_0\rangle|x_1\rangle|x_2\rangle|x_3\rangle. \tag{3.38}$$

I have put in unnecessary parentheses to guide the eye to the simple 
structure, whose generalization to more than four Qbits is, as promised, 
obvious. 

If we define the unitary operator P to bring about the permutation 
of computational basis states 

$$P|x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle = |x_0\rangle|x_1\rangle|x_2\rangle|x_3\rangle, \tag{3.39}$$

then (3.38) becomes 

$$U_{FT}|x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle = H_3(V_{32}H_2)(V_{31}V_{21}H_1)(V_{30}V_{20}V_{10}H_0)P|x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle. \tag{3.40}$$

Since (3.40) holds for all computational-basis states it holds for 
arbitrary states and is therefore equivalent to the operator identity 

$$U_{FT} = H_3(V_{32}H_2)(V_{31}V_{21}H_1)(V_{30}V_{20}V_{10}H_0)P. \tag{3.41}$$

The form (3.41) expresses $U_{FT}$ as a product of unitary operators, 
thereby independently establishing what we have already noted directly 
from its definition, that $U_{FT}$ is unitary. More importantly it gives an 
explicit construction of $U_{FT}$ entirely out of one- and two-Qbit unitary 
gates, whose number grows only quadratically with the number n of 
Qbits. (The permutation P can be constructed out of cNOT gates and 
one additional Qbit, initially in the state $|0\rangle$ - an instructive exercise 
to think about - but in the application that follows it is much easier to 
build directly into the circuitry the rearranging of Qbits accomplished 
by P.) 
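The factorization (3.41) can be tested against the definition (3.18) by simulating the gates on a state vector. The sketch below is an illustration under stated assumptions, not an efficient simulator: a state is stored as a list of $2^n$ amplitudes with Qbit 0 as the least significant bit of the index, and the function names are hypothetical. The loop applies the factors of (3.41) from right to left and generalizes the pattern to n Qbits in the way the text calls obvious.

```python
import cmath
import math

def hadamard(state, q):
    """Apply H to Qbit q (Qbit 0 is the least significant bit of the index)."""
    out = list(state)
    mask = 1 << q
    s = 1 / math.sqrt(2)
    for idx in range(len(state)):
        if not idx & mask:
            a, b = state[idx], state[idx | mask]
            out[idx], out[idx | mask] = s * (a + b), s * (a - b)
    return out

def v_gate(state, i, j):
    """Apply V_ij = exp(i*pi*n_i*n_j / 2^|i-j|), diagonal in the computational basis."""
    phase = cmath.exp(1j * math.pi / 2 ** abs(i - j))
    return [amp * phase if (idx >> i) & 1 and (idx >> j) & 1 else amp
            for idx, amp in enumerate(state)]

def bit_reverse(state, n):
    """Apply P, which reverses the order of the n Qbits as in (3.39)."""
    out = [0j] * len(state)
    for idx, amp in enumerate(state):
        out[int(format(idx, "0{}b".format(n))[::-1], 2)] = amp
    return out

def qft_circuit(state, n):
    """Apply the gates of (3.41), rightmost factor first; also count the H and V gates."""
    state = bit_reverse(state, n)
    gate_count = 0
    for j in range(n):
        state = hadamard(state, j)        # H_j acts first within its group ...
        gate_count += 1
        for i in range(j + 1, n):         # ... followed by V_ij for every i > j
            state = v_gate(state, i, j)
            gate_count += 1
    return state, gate_count

def qft_definition(x, n):
    """Amplitudes of U_FT|x> taken directly from (3.18)."""
    dim = 2 ** n
    return [cmath.exp(2j * math.pi * x * y / dim) / math.sqrt(dim) for y in range(dim)]

n = 4
worst = 0.0
for x in range(2 ** n):
    basis = [0j] * 2 ** n
    basis[x] = 1
    out, gates = qft_circuit(basis, n)
    worst = max(worst, max(abs(a - b) for a, b in zip(out, qft_definition(x, n))))
```

For n = 4 the count comes to ten gates besides P (four Hadamards and six V gates), that is n(n + 1)/2, which is the quadratic growth noted above.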

The permutation operator P plays an important role in establishing 
that the circuit (3.41) that produces the quantum Fourier transform 
$U_{FT}$ has an inverse $U_{FT}^\dagger$ possessing the structure one expects for an 
inverse Fourier transform. Since the adjoint of a product is the product 
of the adjoints in the opposite order, and since Hadamards and P are 
self-adjoint, we have from (3.41) 

$$U_{FT}^\dagger = P(H_0V_{10}^\dagger V_{20}^\dagger V_{30}^\dagger)(H_1V_{21}^\dagger V_{31}^\dagger)(H_2V_{32}^\dagger)H_3. \tag{3.42}$$

One can insert 1 = PP on the extreme right of (3.42) and then note 
that the effect of sandwiching all the Hadamards and 2-Qbit unitaries 
between two Ps is simply to alter all their indices by the permutation 
taking 0123 → 3210. Therefore 

$$U_{FT}^\dagger = (H_3V_{23}^\dagger V_{13}^\dagger V_{03}^\dagger)(H_2V_{12}^\dagger V_{02}^\dagger)(H_1V_{01}^\dagger)H_0P. \tag{3.43}$$

Fig 3.1 


A diagram of a circuit that illustrates, for four Qbits, the 
construction of the quantum Fourier transform $U_{FT}$ defined in (3.18) 
as the product of 1- and 2-Qbit gates given in (3.40). 


If we now move every $V^\dagger$ to the right past as many Hadamards as we 
can, keeping in mind that each V commutes with all Hadamards except 
those sharing either of its indices, then we have 

$$U_{FT}^\dagger = (H_3V_{23}^\dagger)(H_2V_{13}^\dagger V_{12}^\dagger)(H_1V_{03}^\dagger V_{02}^\dagger V_{01}^\dagger)H_0P. \tag{3.44}$$

Finally, if we note from (3.37) that each V is symmetric in its indices, 
and rearrange the parentheses in (3.44) to make easier the comparison 
with the form (3.41) of $U_{FT}$, we have 

$$U_{FT}^\dagger = H_3(V_{32}^\dagger H_2)(V_{31}^\dagger V_{21}^\dagger H_1)(V_{30}^\dagger V_{20}^\dagger V_{10}^\dagger H_0)P. \tag{3.45}$$

This is precisely the form (3.41) of $U_{FT}$ itself, except that each V is 
replaced by its adjoint, which (3.37) shows amounts to replacing each 
i by -i in the arguments of all the phase factors. This is exactly what 
one does to invert the ordinary functional Fourier transform. 


3.6 Eliminating the 2-Qbit gates 

A circuit diagram that compactly expresses the content of (3.40) is 
shown in Figure 3.1. As is always the case in such diagrams, the order 
in which the gates act is from left to right, although in the equation 
(3.40) that the diagram represents, the order in which the gates act is 
from right to left. The diagram introduces an artificial asymmetry into 
the 2-Qbit unitary gate $V_{ij}$, treating one Qbit as a control bit, which 
determines whether or not the unitary operator $e^{i\pi\mathbf{n}/2^{|i-j|}}$ acts on the 
other Qbit, taken to be the target. Although this is the most common 
way of representing the circuit for the quantum Fourier transform, the 
figure could equally well have been drawn with the opposite convention, 
as in Figure 3.2. 

Fig 3.2 


Since the action (3.37) of the controlled-V gates is symmetric 
in i and j, Figure 3.1 can be redrawn with control and target Qbits 
interchanged. 


Both Figure 3.1 and Figure 3.2 follow the usual convention, in which 
Qbits representing more significant bits are represented by lines higher 
in the figure. Acting on the computational basis, however, the first gate 
on the left, P, permutes the states of the Qbits, exchanging the states of 
the most and least significant Qbits, the states of the next most 
significant and next least significant Qbits, etc. Rather than introducing 
such a permutation gate, it makes more sense simply to reverse the 
convention for the input state, associating Qbits that represent more 
significant bits with lower lines in the figure. The gate P is then 
omitted, and the conventional ordering of significant bits is reversed for 
the input. The complete figure thus reduces to the portion to the right of 
the permutation gate P. For the output, of course, the conventional ordering 
remains in effect: Qbits on higher lines represent more significant bits. 

If the input on the left of the complete Figure 3.1 or 3.2 (with the gate 
P) is the computational-basis state $|x\rangle_4 = |x_3\rangle|x_2\rangle|x_1\rangle|x_0\rangle$, the output 
on the right will be $U_{FT}|x\rangle_4$, the superposition of computational-basis 
states $|y\rangle_4 = |y_3\rangle|y_2\rangle|y_1\rangle|y_0\rangle$ defined in (3.18). 

There is no need for the figures to have subscripts on the Hadamard 
gates appearing in (3.40), since each is explicitly attached to the line 
associated with the Qbit on which it acts. For the same reason each 
2-Qbit controlled-V gate requires only a single subscript, which 
specifies the unitary operator $V_k$ that acts on the target Qbit when the 
computational-basis state of the control Qbit is $|1\rangle$; the subscript k is 
the number of “wires” the target Qbit is away from the control Qbit. 
The explicit form of $V_k$ is $e^{i\pi\mathbf{n}/2^k}$, where $\mathbf{n}$ is the number operator for 
the target Qbit. 

Fig 3.3 


If the Qbits are all measured immediately after all the gates of 
the quantum Fourier transform have acted, then the 1-Qbit 
measurement gates can be applied to each Qbit immediately after the 
action of the Hadamard gate on that Qbit, and the controlled-V gates 
that follow the action of the Hadamards in Figure 3.2 can be replaced 
by 1-Qbit gates that act or not depending on whether the outcome y of 
the 1-Qbit measurement is 1 or 0. 


Figure 3.2 reveals a further simplification of great practical interest, 
if all the Qbits are measured as soon as the action of the quantum 
Fourier transformation is completed. This simplification, pointed out 
by Griffiths and Niu, allows the 2-Qbit controlled-V gates to be 
replaced by 1-Qbit gates that act or not, depending on the outcome of 
a prior measurement of the control Qbit, as shown in Figure 3.3. The 
simplification is made possible by the following general fact. 

If a controlled operation $C^U$, or a series of consecutive controlled 
operations all with the same control Qbit, is immediately followed by 
a measurement of the control Qbit, then the possible final states of all 
the Qbits and the probabilities of those states are exactly the same as 
they would be if the measurement of the control Qbit took place before 
the application of the controlled operation, and then the target Qbit(s) 
were acted upon or not by U, depending on whether the result of the 
prior measurement was 1 or 0. To confirm this, write an n-Qbit state as 

$$|\Psi\rangle_n = \alpha_0|0\rangle_1|\Phi_0\rangle_{n-1} + \alpha_1|1\rangle_1|\Phi_1\rangle_{n-1}, \tag{3.46}$$

where the state of the control Qbit is on the left, the states $|\Phi_i\rangle$ are unit 
vectors, and the unitary operation U acts on some or all of the remaining 
n - 1 Qbits. Applying the controlled-U operation $C^U$ to $|\Psi\rangle_n$ gives 

$$C^U|\Psi\rangle_n = \alpha_0|0\rangle_1|\Phi_0\rangle_{n-1} + \alpha_1|1\rangle_1U|\Phi_1\rangle_{n-1}. \tag{3.47}$$

If this is immediately followed by a measurement of the control Qbit, 
the post-measurement states and associated probabilities are 

$$|0\rangle|\Phi_0\rangle,\ p = |\alpha_0|^2; \qquad |1\rangle U|\Phi_1\rangle,\ p = |\alpha_1|^2, \tag{3.48}$$

according to the generalized Born rule. On the other hand if we 
measure the control Qbit before applying the controlled-U, the 
resulting states and associated probabilities are 

$$|0\rangle|\Phi_0\rangle,\ p = |\alpha_0|^2; \qquad |1\rangle|\Phi_1\rangle,\ p = |\alpha_1|^2, \tag{3.49}$$

so if we then apply U to the remaining n - 1 Qbits if and only if the 
result of the earlier measurement was 1, we end up with exactly the 
same states and probabilities as in (3.48). 
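The argument of (3.46)-(3.49) can be traced numerically in the smallest case, one control Qbit and one target Qbit. In the sketch below the 2-Qbit state is a list of four amplitudes indexed as 2*control + target; the particular state and the gate U are arbitrary illustrative choices, and the function names are hypothetical.

```python
import cmath
import math

def apply_1q(u, target):
    """Apply a 2x2 matrix u to a 1-Qbit state [amp0, amp1]."""
    return [u[0][0] * target[0] + u[0][1] * target[1],
            u[1][0] * target[0] + u[1][1] * target[1]]

def controlled_u(state, u):
    """Apply C^U: u acts on the target only in the control = 1 component (3.47)."""
    return state[:2] + apply_1q(u, state[2:])

def measure_control(state):
    """Measure the control Qbit (generalized Born rule).

    Returns {outcome: (probability, normalized target state)}; assumes both
    outcomes have nonzero probability."""
    result = {}
    for c in (0, 1):
        branch = state[2 * c: 2 * c + 2]
        p = sum(abs(a) ** 2 for a in branch)
        result[c] = (p, [a / math.sqrt(p) for a in branch])
    return result

# Illustrative normalized state and a phase gate U = exp(i*pi*n/4) (assumptions).
state = [0.5, 0.5j, 0.1, 0.7]
u = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]

# Controlled gate first, then measurement of the control, as in (3.48).
late = measure_control(controlled_u(state, u))

# Measurement first, as in (3.49), then U applied only if the outcome was 1.
early = {c: (p, apply_1q(u, t) if c == 1 else t)
         for c, (p, t) in measure_control(state).items()}
```

Both routes give the same probabilities and the same post-measurement target states, which is the content of the general fact stated above.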

We shall see that if one’s aim is to find the period of the function 
f, one can indeed measure each Qbit immediately after applying the 
quantum Fourier transform. So this replacement of controlled unitary 
gates by 1-Qbit unitary gates, which act or not depending on the 
outcome of the measurement, is of great importance from the 
technological point of view, 1-Qbit unitaries being far easier to implement 
than 2-Qbit controlled gates. 

To see how the general procedure works in this particular case, 
consider first the bottom wire in Figure 3.2. Once H and the three 
controlled-V gates have acted on it, nothing further happens to that 
Qbit until its final measurement. If the result of that measurement 
is 1, the state of all four Qbits reduces to that component of the 
full superposition in which $V_1$, $V_2$, and $V_3$ have acted on the three 
wires above the bottom wire; if the result of the measurement is 0, the 
4-Qbit state reduces to the component in which they have not acted. 
We can produce exactly the same effect if we measure the least 
significant output Qbit immediately after H has acted on the bottom wire, 
before any of the other gates have acted, and then apply or do not apply 
the three unitary transformations to the other three Qbits, depending 
on whether the outcome of the measurement is 1 or 0. Next, we apply 
the Hadamard transformation to the second wire from the bottom. We 
then immediately measure that Qbit and, depending on the outcome, 
apply or do not apply the appropriate 1-Qbit unitary transformations 
to each of the remaining two Qbits. Continuing in this way, we end up 
producing exactly the same statistical distribution of measurement 
results as we would have produced had we used the 2-Qbit controlled-V 
gates, measuring none of the Qbits until the full unitary transformation 
$U_{FT}$ had been produced. Thus Figure 3.2, followed by measurements 
of all four Qbits on the right yielding the values $y_3$, $y_2$, $y_1$, and $y_0$, is 
equivalent to Figure 3.3. 
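The equivalence can be checked by enumerating every sequence of measurement outcomes in the measure-as-you-go procedure and comparing the resulting distribution of y with the squared amplitudes of the full transform. The following sketch assumes the same state-vector conventions as a simple simulator (Qbit 0 is the least significant bit of the index); the function names and the small periodic input state are illustrative assumptions.

```python
import cmath
import math

def hadamard(state, q):
    """Apply H to Qbit q (Qbit 0 is the least significant bit of the index)."""
    out = list(state)
    mask = 1 << q
    s = 1 / math.sqrt(2)
    for idx in range(len(state)):
        if not idx & mask:
            a, b = state[idx], state[idx | mask]
            out[idx], out[idx | mask] = s * (a + b), s * (a - b)
    return out

def phase(state, q, k):
    """Apply the 1-Qbit gate exp(i*pi*n/2^k) to Qbit q."""
    f = cmath.exp(1j * math.pi / 2 ** k)
    return [a * f if (idx >> q) & 1 else a for idx, a in enumerate(state)]

def project(state, q, bit):
    """Measure Qbit q and keep outcome `bit`: (probability, normalized state)."""
    kept = [a if ((idx >> q) & 1) == bit else 0j for idx, a in enumerate(state)]
    p = sum(abs(a) ** 2 for a in kept)
    return (p, [a / math.sqrt(p) for a in kept]) if p > 1e-15 else (0.0, kept)

def semiclassical_probs(state, n):
    """Distribution of y when each Qbit is measured right after its Hadamard.

    `state` is the input with the Qbit order already reversed (gate P omitted)."""
    probs = [0.0] * 2 ** n

    def branch(current, j, y, weight):
        if j == n:
            probs[y] += weight
            return
        current = hadamard(current, j)
        for bit in (0, 1):
            p, post = project(current, j, bit)
            if p == 0.0:
                continue
            if bit == 1:                       # classically controlled 1-Qbit gates
                for i in range(j + 1, n):
                    post = phase(post, i, i - j)
            branch(post, j + 1, y | (bit << j), weight * p)

    branch(state, 0, 0, 1.0)
    return probs

def exact_probs(psi, n):
    """|<y|U_FT|psi>|^2 computed directly from the definition (3.18)."""
    dim = 2 ** n
    return [abs(sum(a * cmath.exp(2j * math.pi * x * y / dim)
                    for x, a in enumerate(psi))) ** 2 / dim for y in range(dim)]

def reverse_qbits(psi, n):
    """Relabel amplitudes so that Qbit k of the input becomes Qbit n-1-k."""
    out = [0j] * len(psi)
    for idx, a in enumerate(psi):
        out[int(format(idx, "0{}b".format(n))[::-1], 2)] = a
    return out

n = 4
support = [1, 4, 7, 10, 13]                    # illustrative periodic state with r = 3
psi = [1 / math.sqrt(len(support)) if x in support else 0j for x in range(2 ** n)]
p_semi = semiclassical_probs(reverse_qbits(psi, n), n)
p_exact = exact_probs(psi, n)
```

The two lists agree to rounding error. The enumeration visits up to $2^n$ branches, so it is a check for small n only; in an actual run one follows a single branch, chosen by the measurement outcomes.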

The most attractive (but least common) way of representing the 
quantum Fourier transform with a circuit diagram is shown in Figure 
3.4.⁷ In this form the inversion in order from most to least significant 
Qbits between the input and the output is shown by bending the Qbit 
lines, rather than by inverting the order in the state symbols. The 
2-Qbit gates V are also displayed in a symmetric way that does not suggest 
an artificial distinction between control and target Qbits. 


7 The figure is based on one drawn by Robert B. Griffiths and Chi-Sheng Niu 
in their paper setting forth the Griffiths-Niu trick, “Semiclassical Fourier 
transform for quantum computation,” Physical Review Letters 76, 3228-3231 
(1996) (http://arxiv.org/abs/quant-ph/9511007). 

Fig 3.4 


A more symmetric way of drawing Figure 3.1 or 3.2, due to Griffiths and 
Niu. Although it is superior to the conventional diagram, it does not seem 
to have caught on. The permutation P that in effect permutes the Qbits 
in the input register is now built into the diagram by using lines that no 
longer connect input-register Qbits to output-register Qbits at the same 
horizontal level. Because the lines now cross one another, the unitary 
operators V can be represented by the circles at the intersections of the 
lines associated with the Qbits that they couple, eliminating the artificial 
distinction between control and target Qbits used in Figures 3.1 and 3.2. 
The form of each such operator is $V = \exp(i\pi\mathbf{n}\mathbf{n}'/2^k)$, where $\mathbf{n}$ and 
$\mathbf{n}'$ are the Qbit number operators associated with the two lines that cross 
at the dot, and k = 1, 2, or 3 depending on whether the dot lies in the 
first, second, or third horizontal row below the top row of Hadamard 
transformations. The larger the phase produced by V, the blacker the circle. 


3.7 Finding the period 

The period r of f appears in the state (3.16) of the input-register Qbits 
produced from a single application of $U_f$. To get valuable information 
about r we apply the quantum Fourier transformation (3.18) to the 
input register: 

$$\begin{aligned}
U_{FT}\frac{1}{\sqrt m}\sum_{k=0}^{m-1}|x_0 + kr\rangle
&= \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1}\frac{1}{\sqrt m}\sum_{k=0}^{m-1} e^{2\pi i(x_0+kr)y/2^n}|y\rangle \\
&= \sum_{y=0}^{2^n-1} e^{2\pi i x_0y/2^n}\frac{1}{\sqrt{2^n m}}\left(\sum_{k=0}^{m-1} e^{2\pi i kry/2^n}\right)|y\rangle.
\end{aligned} \tag{3.50}$$


If we now make a measurement, the probability p(y) of getting the 
result y is just the squared magnitude of the coefficient of $|y\rangle$ in (3.50). 
The factor $e^{2\pi i x_0y/2^n}$, in which the formerly troublesome $x_0$ explicitly 
occurs, drops out of this probability⁸ and we are left with 

$$p(y) = \frac{1}{2^n m}\left|\sum_{k=0}^{m-1} e^{2\pi i kry/2^n}\right|^2. \tag{3.51}$$

This completes the quantum-computational part of the process, 
except that, as noted below, we may have to repeat the procedure a 
small number of times (of order ten or so) to achieve a high probability 
of learning the period r. To see why the form (3.51) of p(y) makes this 
possible, we require some further purely mathematical analysis, that at 
a certain point will exploit yet another branch of elementary number 
theory. 

The probability (3.51) is a simple explicit function of the integer y, 
whose magnitude has maxima when y is close⁹ to integral multiples of 
$2^n/r$. In fact we now show that the probability is at least 0.4 that the 
measured value of y will be as close as possible to - i.e. within 1/2 of - 
an integral multiple of $2^n/r$. To see this we calculate a lower bound for 
p(y) when 

$$y = y_j = j2^n/r + \delta_j, \tag{3.52}$$

with $|\delta_j| \le 1/2$. Only the term in $\delta_j$ contributes to the exponentials in 
(3.51), since the term $j2^n/r$ adds only an integral multiple of $2\pi i$ to each 
exponent. The summation is a geometric series, which can be explicitly 
summed to give 

$$p(y_j) = \frac{1}{2^n m}\,\frac{\sin^2(\pi\delta_j mr/2^n)}{\sin^2(\pi\delta_j r/2^n)}. \tag{3.53}$$

Since (3.17) tells us that m is within an integer of $2^n/r$, and since 
$2^n/r > N^2/r > N$, we can with negligible error replace $mr/2^n$ by 1 
in the numerator of (3.53), and replace the sine in the denominator by 
its (extremely small) argument. This gives 

$$p(y_j) = \frac{1}{2^n m}\left(\frac{\sin(\pi\delta_j)}{\pi\delta_j r/2^n}\right)^2 = \frac{1}{r}\left(\frac{\sin(\pi\delta_j)}{\pi\delta_j}\right)^2. \tag{3.54}$$

When x is between 0 and $\pi/2$, the graph of sin x lies above the 
straight line connecting the origin to the maximum at $x = \pi/2$: 

$$\frac{x}{\pi/2} \le \sin x, \qquad 0 \le x \le \pi/2. \tag{3.55}$$

Since $|\delta_j| \le 1/2$ the probability (3.54) is bounded below by 

$$p(y_j) \ge (4/\pi^2)/r. \tag{3.56}$$

Since there are at least r - 1 different values of j, and since r is a large 
number,¹⁰ one has at least a 40% chance ($4/\pi^2 = 0.4053$) of getting 
one of the special values (3.52) for y - a value that is within 1/2 of an 
integral multiple of $2^n/r$. 

Note, in passing, that as $\delta_j \to 0$ in (3.54) the probability $p(y_j)$ 
becomes 1/r, so that if all the $\delta_j$ are 0 - i.e. if the period r is exactly a 
power of 2 - then the probability of measuring an integral multiple of 
$2^n/r$ is essentially 1. Indeed, you can easily check that in this (highly 
unlikely) case the probability remains 1 even if we do not double the 
number of Qbits in the input register and take $n = n_0$. Thus the case 
$r = 2^j$ avoids some of the major complications of quantum period 
finding. Since r divides (p - 1)(q - 1), all periods modulo pq must be 
powers of 2 if p and q are both primes of the form $2^j + 1$. The smallest 
such primes are 3, 5, 17, and 257. Hence claims to have realized the 
Shor algorithm for factoring 15 are to be taken cum grano salis, as should 
possible future claims based on factoring 51, 85, and 771. 


8 The random value of $x_0 < r$ also determines whether m is given by rounding 
the enormous value of $2^n/r$ up or down to the nearest integer - see Equation 
(3.17) and the surrounding text - but this makes a negligible difference in 
what follows. 

9 Such sums of phase factors are familiar to physicists (to whom this 
cautionary footnote is addressed), particularly in the context of 
time-dependent perturbation theory, where one approximates them in terms 
of Dirac delta-functions concentrated in the maximum values. The analysis 
required here is different in two important ways. Because we need to know 
the enormous integer r exactly we must pay much more careful attention to 
just how much of the probability is concentrated in those special values of 
y, and we must also solve the subtle problem of how to get from such 
maximum values to the precise period r itself. 

10 One can easily test with a classical computer all values of r less than, say, 
100, to see whether they are periods of f; one need resort to the quantum 
computation only if r itself is enormous. 
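The lower bound (3.56) can be compared with the exact closed form (3.53) in a small case. The sketch below again takes the illustrative values n = 9 and r = 6 (assumptions chosen for convenience, far smaller than anything the algorithm is meant for) and evaluates (3.53) at the integer $y_j$ nearest each multiple of $2^n/r$; the function names are hypothetical.

```python
import math

def p_closed_form(n, r, m, delta):
    """Equation (3.53) evaluated at y_j = j*2^n/r + delta."""
    dim = 2 ** n
    if delta == 0:
        return m / dim                      # limit of the ratio of sines
    num = math.sin(math.pi * delta * m * r / dim) ** 2
    den = math.sin(math.pi * delta * r / dim) ** 2
    return num / (dim * m * den)

def peak_probabilities(n, r):
    """For each j, the integer y_j closest to j*2^n/r and its probability."""
    dim = 2 ** n
    m = dim // r                            # m rounded down; see footnote 8
    peaks = []
    for j in range(r):
        y = round(j * dim / r)
        delta = y - j * dim / r
        peaks.append((y, p_closed_form(n, r, m, delta)))
    return peaks

n, r = 9, 6
peaks = peak_probabilities(n, r)
total = sum(p for _, p in peaks)            # probability of landing on some y_j
lower_bound = 4 / math.pi ** 2              # 0.4053...
```

In this example the $y_j$ are 0, 85, 171, 256, 341, and 427, each peak exceeds $(4/\pi^2)/r$, and the total comfortably exceeds 0.4. The bound is a worst case, reached only when every $\delta_j$ is close to 1/2 in magnitude.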

Note also that the derivation of (3.56) requires only that the 
argument of the sine in the denominator of (3.53) be small. This will be the 
case if $2^n$ is any large multiple of N - i.e. if the input register is large 
enough to contain many periods of $b^x$ (mod N). The stronger 
requirement that $2^n$ should be as large as $N^2$ - that the input register should 
actually be able to accommodate at least N full periods - emerges when 
we examine whether it is possible to learn r itself, given an integral 
multiple of $2^n/r$. 

Suppose that we have found a y that is within 1/2 of $j2^n/r$ for some 
integer j. It follows that 

$$\left|\frac{y}{2^n} - \frac{j}{r}\right| \le \frac{1}{2^{n+1}}. \tag{3.57}$$

Since y is the result of our measurement and we know n, the number 
of input-register Qbits, we have an estimate for the fraction j/r. It is 
here that our use of an n-Qbit input register with $2^n > N^2$ is crucial. 
By using twice as many Qbits as needed to represent all the integers 
up to N, we have ensured that our estimate (3.57) of j/r is off by 
no more than $1/(2N^2)$. But since r < N, and since any two distinct 
fractions with denominators less than N must differ¹¹ by at least $1/N^2$, 
the measured value of y and the fact that r is less than N is enough to 
determine a unique value of the rational number j/r. 

That value of j/r can be efficiently extracted from the known value 
of $y/2^n$ by an application of the theory of continued fractions. This 
exploits the theorem that if x is an estimate for j/r that differs from it 
by less than $1/(2r^2)$, then j/r will appear as one of the partial sums in 
the continued-fraction expansion of x. The application of the theorem 
in this context is illustrated in Appendix K. The continued-fraction 
expansion of $y/2^n$ gives us not j and r separately, but the fraction j/r 
reduced to lowest terms - i.e. it gives us integers $j_0$ and $r_0$ with no 
common factors that satisfy $j_0/r_0 = j/r$. The $r_0$ we learn is thus a 
divisor of r. 
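The continued-fraction step is entirely classical and short to write down. The sketch below generates the partial sums (convergents) of $y/2^n$ with the standard recurrence and keeps the last one whose denominator is less than N; by the uniqueness argument above, that convergent is $j_0/r_0$ whenever y is within 1/2 of $j2^n/r$. The function names and the example values (N = 21, n = 9, period 6) are illustrative assumptions.

```python
from fractions import Fraction

def convergents(numerator, denominator):
    """Yield the convergents of the continued-fraction expansion of numerator/denominator."""
    h_prev, h = 0, 1          # numerators of the two preceding convergents
    k_prev, k = 1, 0          # denominators of the two preceding convergents
    while denominator:
        a, remainder = divmod(numerator, denominator)
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        yield Fraction(h, k)
        numerator, denominator = denominator, remainder

def candidate_divisor(y, n, N):
    """Return r0, the denominator of the last convergent of y/2^n with denominator < N."""
    best = None
    for c in convergents(y, 2 ** n):
        if c.denominator < N:
            best = c              # later convergents lie closer to y/2^n
        else:
            break
    return best.denominator

# Measured values near j*512/6 for j = 1 and j = 2 (illustrative).
r0_from_85 = candidate_divisor(85, 9, 21)     # 85/512 is close to 1/6
r0_from_171 = candidate_divisor(171, 9, 21)   # 171/512 is close to 2/6 = 1/3
```

The first measurement returns the full period 6, because j = 1 is coprime to r. The second returns only the divisor 3, because j = 2 shares the factor 2 with r.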

Since r is $r_0$ times the factors j has in common with r, if we were 
lucky enough to get a j that is coprime to r, then $r_0 = r$. Since, as shown 
in Appendix J, two random numbers j and r have a better than even 
chance of having no common factors, we do not have to be terribly lucky. 
We can easily check to see whether $r_0$ itself is the period r by computing 
(with a classical computer) $b^{r_0}$ (mod N) and seeing whether or not it is 
1. If it is not, we can try several low multiples, $2r_0$, $3r_0$, $4r_0$, ..., since 
it is unlikely that j will share a large factor with r. 


11 For 

$$\left|\frac{a}{b} - \frac{c}{d}\right| \ge \frac{1}{bd}$$

unless the two fractions are identical. 

If this fails, we can repeat the entire quantum computation from 
the beginning. We now get j'/r, where j' is another (random) integer, 
yielding another divisor $r_0'$ of r, which is r divided by the factors it has 
in common with j'. If j and j' have no factors in common - which 
has a better than even chance of happening - then r will be the least 
common multiple¹² of its two divisors $r_0$ and $r_0'$. We can again test to see 
whether we have the right r by evaluating $b^r$ (mod N) to see whether 
it is indeed equal to 1. If it is not, we can again try some of the lower 
multiples of our candidate for r and, if necessary, go through the whole 
business one more time to get yet another random multiple of 1/r. 
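The classical bookkeeping described here fits in a few lines. The sketch below combines divisors of r by taking least common multiples and tests low multiples of each candidate, using the condition $b^r \equiv 1$ (mod N), which is what f(x + r) = f(x) amounts to for $f(x) = b^x$ (mod N). The function names, the cutoff `max_multiple`, and the example (b = 2, N = 21) are illustrative assumptions.

```python
from math import gcd

def is_period_multiple(b, N, r):
    """Classical check that b^r = 1 (mod N), i.e. that r is a multiple of the period."""
    return pow(b, r, N) == 1

def lcm(a, c):
    """Least common multiple as product divided by greatest common divisor."""
    return a * c // gcd(a, c)

def period_from_divisors(b, N, divisors, max_multiple=4):
    """Combine divisors r0, r0', ... of the period and test low multiples of the result.

    Returns the period, or None if more runs of the quantum computation are needed."""
    candidate = 1
    for d in divisors:
        candidate = lcm(candidate, d)
        for k in range(1, max_multiple + 1):
            if is_period_multiple(b, N, k * candidate):
                return k * candidate
    return None

# One run yielding the divisor 3: the low multiple 2 * 3 = 6 passes the test.
from_one_run = period_from_divisors(2, 21, [3])

# Two runs yielding divisors 2 and 3, with no multiples tried: lcm(2, 3) = 6.
from_two_runs = period_from_divisors(2, 21, [2, 3], max_multiple=1)
```

Because each candidate divides r and the multiples are tried in increasing order, the first multiple that passes the test is r itself rather than a larger multiple of it.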

Because we are not certain that our measurement gives us one of the 
$y_j$ and thus a divisor of r, we may have to repeat the whole procedure 
several (but not a great many) times before succeeding, carrying out 
some not terribly taxing mathematical detective work, with the aid of a 
classical computer, to find the period r. The detective work is greatly 
simplified by the fact (established in Appendix L) that when N is the 
product of two primes, the period r is not only less than N, but also 
less than N/2. As a result, a more extended analysis shows that the 
probability of learning a divisor of r from the measured value of y is 
bounded from below not just by 0.4, but by more than 0.9. Furthermore, 
by adding just a small number q of additional Qbits to the input register, 
so that n exceeds $2n_0 + q$, the probability of learning a divisor of r in a 
single run can be made quite close to 1. These refinements are described 
in Appendix L. 


12 The least common multiple of two numbers is their product divided by 
their greatest common divisor; the greatest common divisor can be found 
with the Euclidean algorithm, as shown in Appendix J. 


3.8 Calculating the periodic function 

We have assumed the existence of an efficient subroutine that 
calculates $b^x$ (mod N). You might think that calculating $f(x) = b^x$ (mod 
N) for arbitrary values of x less than, say, $2^n = 10^{800}$ would require 
astronomical numbers of multiplications, but it does not. We simply 
square b (mod N), square the result (mod N), square that, etc., 
calculating the comparatively small number of powers $b^{2^j}$ (mod N) with 
j < n. The binary expansion of $x = x_{n-1}x_{n-2}\cdots x_1x_0$ tells us which 
of these must be multiplied together to get $b^x = \prod_j (b^{2^j})^{x_j}$. 

So if we start with x in the input register, 1 (i.e. 000...001) in 
the output register, and b in an additional work register, then we can 
proceed as follows: 

(a) multiply the output register by the work register if and only if 
$x_0 = 1$; 

(b) replace the contents of the work register by its modulo-N square; 

(a′) repeat (a) with the multiplication now conditional on $x_1 = 1$; 

(b′) repeat (b); 

(a″) repeat (a) with the multiplication now conditional on $x_2 = 1$; etc. 

At the end of this process we will still have x in the input register (which
serves only as a set of control bits for the n controlled multiplications),
and we will have $b^x \pmod N$ in the output register. The work register
will contain $b^{2^n} \pmod N$ whatever the value of x in the input register,
and it will therefore be unentangled with the input and output registers and
can be ignored when we take our starting point to be a superposition
of classical inputs. 13
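
The steps (a), (b), (a′), (b′), ... can be mirrored on ordinary integers. The sketch below is only a classical illustration of the register bookkeeping, not a quantum circuit; the function name and the three variables standing in for the input, output, and work registers are assumptions made for the example.

```python
def power_by_repeated_squaring(b, x, N, n):
    """Return (output, work) after n rounds of steps (a) and (b).

    x plays the role of the input register and is only read, never changed.
    """
    output = 1        # output register starts as 000...001
    work = b % N      # work register starts out holding b
    for j in range(n):
        if (x >> j) & 1:                  # step (a): multiply iff bit x_j is 1
            output = (output * work) % N
        work = (work * work) % N          # step (b): modulo-N square of the work register
    return output, work


# The output register ends up holding b^x (mod N); the work register ends up
# holding b^(2^n) (mod N), which does not depend on x.
out, work = power_by_repeated_squaring(b=7, x=11, N=15, n=4)
```

Only n squarings and at most n multiplications are needed, however large x is.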

Note the striking difference between classical and quantum
programming styles. One’s classical computational instincts would direct
one to make a look-up table of all n modulo-N multiple squares of b,
since (a) Cbits are cheap and stable and (b) otherwise to get $b^x$ (mod
N) for all the needed values of x one would have to recalculate the
successive squares so many times that this would become ridiculously
inefficient. But the situation is quite the opposite with a quantum
computer, since (a) Qbits are expensive and fragile and (b) “quantum
parallelism” makes it possible to produce the state (3.15) with only a
single execution of the procedure that does the successive squarings,
thereby relieving us of any need to store all the modulo-N squares, at
a substantial saving in Qbits.

As usual with quantum parallelism, there is the major catch that an
immediate measurement of Qbits in the state (3.15) can reveal only the
value of a single (random) one of the modulo-N powers of b. But by
applying $U_{FT}$ to the input register of the state (3.15) and only then
making the measurement, we can get important collective information
about the modulo-N values of $b^x$ - in this case a divisor of the crucial
period r - at the (unimportant) price of losing all information about
the individual values of $b^x$.


3.9 The unimportance of small phase errors 

To execute the quantum Fourier transform one needs 2-Qbit gates
$V_{ij} = e^{\pi i\, n_i n_j/2^{|i-j|}}$ or, if one exploits the Griffiths-Niu
trick, 1-Qbit gates $V_j = e^{\pi i\, n/2^j}$. Since we need to deal with
numbers of many hundreds of digits, the $2^j$ appearing in these phase gates
can be larger than $10^{100}$. Producing such tiny phase shifts requires a
degree of control over the gates that is impossible to achieve. Typically
such phase-shift


13 As noted in Chapter 2, any additional registers used in the squaring and 
multiplication subroutines must also be restored to their initial states to 
insure that they are also disentangled from the input and output registers. 
gates would allow two Qbits to interact in a carefully controlled way
for an interval of time that was specified very precisely, but obviously 
not to hundreds of significant figures. It is therefore crucial that the 
effectiveness of the period-finding algorithm not be greatly affected by 
small errors in the phase shifts. 

On the face of it this seems worrisome. Since we need to know the 
period r to hundreds of digits, don’t we have to get the phase shifts right 
to a comparable precision? Here the fundamentally digital character of 
the actual output of a quantum computation saves the day. To learn r 
we require the outcomes of several hundreds of 1-Qbit measurements, 
each of which has just two possible outcomes (0 or 1). The action of the 
unitary gates that precede the measurements is like that of an analog 
computer, involving continuously variable phase shifts that cannot be 
controlled with perfect precision. But this analog evolution affects only 
the probabilities of the sharply defined digital outputs. Small alterations
in the phases produce small alterations in the probabilities of getting 
that extremely precise digital information, but not the precision of the 
information itself, once it is acquired. 14 

Suppose that the phase of each term in the quantum Fourier transform
(3.18) is incorrect by an amount $\varphi(x, y)$, and that each of these
altered phases is bounded in magnitude by $\varphi \ll 1$. The probability
p(y) in (3.51) will be changed to

$$p_\varphi(y) = \frac{1}{2^n m}\left|\sum_{k=0}^{m-1} e^{2\pi i k r y/2^n}\, e^{i\varphi_k(y)}\right|^2, \tag{3.58}$$

where $\varphi_k(y) = \varphi(x_0 + kr, y)$. Since all the phases
$\varphi_k(y)$ are small compared with unity,

$$e^{i\varphi_k(y)} \approx 1 + i\varphi_k(y), \tag{3.59}$$

and therefore

$$p_\varphi(y) \approx \frac{1}{2^n m}\left|\sum_{k=0}^{m-1} e^{2\pi i k r y/2^n}\bigl(1 + i\varphi_k(y)\bigr)\right|^2. \tag{3.60}$$

What effect does this have on the probability of learning from the
measurement one of the special values $y_j$ given in (3.52)?

We have

$$p_\varphi(y_j) \approx \frac{1}{2^n m}\left|\sum_{k=0}^{m-1} e^{2\pi i k r \delta_j/2^n}\bigl(1 + i\varphi_{jk}\bigr)\right|^2, \tag{3.61}$$


14 For a long time this crucial point seems to have been discussed only in an 
unpublished internal IBM report by D. Coppersmith. In 2002 that 1994 
report finally appeared: D. Coppersmith, “An approximate Fourier 
transform useful in quantum factoring,” 
http://arxiv.org/abs/quant-ph/0201067. 
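where $\varphi_{jk} = \varphi_k(y_j)$. If we expand to linear order in the
small quantities $\varphi_{jk}$, we get

$$p_\varphi(y_j) \approx p(y_j) + \frac{2}{2^n m}\,\mathrm{Im}\left[\sum_{k=0}^{m-1} e^{2\pi i k r \delta_j/2^n} \sum_{k'=0}^{m-1} \varphi_{jk'}\, e^{-2\pi i k' r \delta_j/2^n}\right]. \tag{3.62}$$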

We can get an upper bound on the magnitude of the difference between
the exact and approximate probabilities by replacing the imaginary
part of the product of the two sums by the product of the absolute
values of the sums, and then replacing each term in each sum by its
absolute value. Since the absolute value of each $\varphi_{jk}$ is bounded
by $\varphi$, we can conclude that

$$\bigl|p(y_j) - p_\varphi(y_j)\bigr| \le \frac{2m}{2^n}\,\varphi \approx \frac{2}{r}\,\varphi. \tag{3.63}$$
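
The bound can be checked numerically on a toy case. The sketch below evaluates the sums in (3.51) and (3.58) directly for a small register and compares the largest deviation with $2m\varphi/2^n$; the parameter values, the uniformly random phase errors, and the function names are all assumptions chosen for illustration.

```python
import cmath
import math
import random


def outcome_probability(n, r, x0, y, errors=None):
    """p(y) as in (3.51), or p_phi(y) as in (3.58) when errors are supplied.

    errors[k] plays the role of phi_k(y); None means exact phases.
    """
    size = 2 ** n
    m = len(range(x0, size, r))        # number of x = x0 + k*r below 2^n
    total = 0j
    for k in range(m):
        phase = 2 * math.pi * k * r * y / size
        if errors is not None:
            phase += errors[k]
        total += cmath.exp(1j * phase)
    return abs(total) ** 2 / (size * m), m


def worst_deviation(n, r, x0, phi, rng):
    """Largest |p(y_j) - p_phi(y_j)| over the r special y_j, and the bound."""
    size = 2 ** n
    worst, bound = 0.0, 0.0
    for j in range(r):
        y_j = round(j * size / r)      # integer nearest to j * 2^n / r
        exact, m = outcome_probability(n, r, x0, y_j)
        errors = [rng.uniform(-phi, phi) for _ in range(m)]
        noisy, _ = outcome_probability(n, r, x0, y_j, errors)
        worst = max(worst, abs(exact - noisy))
        bound = 2 * m * phi / size     # right-hand side of the bound
    return worst, bound


worst, bound = worst_deviation(n=10, r=12, x0=3, phi=1 / 500, rng=random.Random(1))
```

Random errors typically fall well inside the bound, since it is a worst-case estimate that ignores cancellations between terms.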

Since there are r different $y_j$, the probability of getting one of the
special values $y_j$ is altered by less than $2\varphi$. So if one is willing
to settle for a probability of getting a special value that is at worst 1%
less than the ideal value of about 0.4, then one can tolerate phase
errors up to $\varphi = 0.4/200 = 1/500$. If one leaves out of the quantum
Fourier transform circuit all controlled-phase gates
$e^{\pi i\, n_i n_j/2^{|i-j|}}$ with $|i - j| > \ell$, the maximum phase error
$\varphi$ this can produce in any term is $\varphi = n\pi/2^\ell$, and
therefore the probability will be within 1% of its ideal value if
$1/2^\ell < 1/(500\,n\pi)$.

The number n of Qbits in the input register might be as large as 3000
for problems of interest (factoring a 500-digit N). Consequently for all
practical purposes one can omit from the quantum Fourier transform
all controlled-phase gates connecting Qbits that are more than about
$\ell = 22$ wires apart in the circuit diagram. This has two major
advantages. Of crucial importance, quantum engineers will not have to
produce impossibly precise phase changes. Furthermore, the size of the
circuit executing the quantum Fourier transform has to grow only linearly
with large n rather than quadratically. Since n is likely to be of order
$10^3$ for practical code breaking, this too is a significant improvement.
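
The arithmetic behind the cutoff and the gate savings can be sketched as follows. The criterion used is $n\pi/2^\ell < 1/500$ taken strictly, and the gate counts assume one controlled-phase gate per pair of Qbits in the full transform; the function names are hypothetical.

```python
import math


def smallest_cutoff(n, tolerance=1 / 500):
    """Smallest l for which n * pi / 2**l is strictly below the tolerance."""
    l = 0
    while n * math.pi / 2 ** l >= tolerance:
        l += 1
    return l


def full_phase_gate_count(n):
    """One controlled-phase gate for every pair of Qbits."""
    return n * (n - 1) // 2


def truncated_phase_gate_count(n, l):
    """Only pairs at most l wires apart are kept; n - d pairs sit d wires apart."""
    return sum(n - d for d in range(1, min(l, n - 1) + 1))


n = 3000
l = smallest_cutoff(n)    # 23 under the strict criterion, close to the round figure of 22
print(l, full_phase_gate_count(n), truncated_phase_gate_count(n, l))
```

The full count grows like $n^2/2$ while the truncated count grows like $n\ell$, which is the linear growth described above.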


3.10 Period finding and factoring 

Since Shor’s period-finding quantum algorithm is always described as 
a factoring algorithm, we conclude this chapter by noting how period 
finding leads to factoring. We consider only the case relevant to RSA 
encryption, where one wants to factor the product of two large primes, 
N = pq, although the connection between period finding and factoring
is more general. 

If we have a way to determine periods (such as Shor’s algorithm)
and want to find the large prime factors of N = pq, we pick a random
number a coprime to N. The odds that a random a happens to be
a multiple of p or of q are minuscule when p and q are enormous,
but if you are the worrying kind you can check that it isn’t, using the
Euclidean algorithm. (In the overwhelmingly unlikely event that a is
a multiple of p or q then the Euclidean algorithm applied to a and N
will give you p or q directly, and you will have factored N.) Using our
period-finding routine, we find the order of a in $G_{pq}$: the smallest r
for which

$$a^r \equiv 1 \pmod{pq}. \tag{3.64}$$

We can use this information to factor N if our choice of a was lucky in
two ways.

Suppose first that we are fortunate enough to get an r that is even.
We can then calculate

$$x = a^{r/2} \pmod{pq} \tag{3.65}$$

and note that

$$0 \equiv x^2 - 1 \equiv (x - 1)(x + 1) \pmod{pq}. \tag{3.66}$$

Now $x - 1 = a^{r/2} - 1$ is not congruent to 0 modulo pq, since r is the
smallest power of a congruent to 1. Suppose in addition - our second
piece of good fortune - that

$$x + 1 = a^{r/2} + 1 \not\equiv 0 \pmod{pq}. \tag{3.67}$$


In that case neither x − 1 nor x + 1 is divisible by N = pq, but (3.66)
tells us that their product is. Since p and q are prime this is possible
only if one of them, say p, divides x − 1 and the other, q, divides
x + 1. Because the only divisors of N are p and q, it follows that p is
the greatest common divisor of N and x − 1, while q is the greatest
common divisor of N and x + 1. We can therefore find p or q by a
straightforward application of the Euclidean algorithm.
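
The classical post-processing just described can be sketched in a few lines. Here a brute-force loop stands in for the quantum period-finding routine, which is only feasible for toy values of N; the function names are assumptions made for the example.

```python
from math import gcd


def order(a, N):
    """Smallest r with a**r = 1 (mod N), for a coprime to N.

    Brute-force classical stand-in for the quantum period-finding routine.
    """
    r, y = 1, a % N
    while y != 1:
        y = (y * a) % N
        r += 1
    return r


def factor_from_period(a, N):
    """Return a pair of factors of N = p*q, or None if the choice of a was unlucky."""
    g = gcd(a, N)
    if g != 1:
        return g, N // g              # a already shares a factor with N
    r = order(a, N)
    if r % 2 == 1:
        return None                   # unlucky: r is odd
    x = pow(a, r // 2, N)
    if x == N - 1:
        return None                   # unlucky: x + 1 is congruent to 0 (mod N)
    return gcd(x - 1, N), gcd(x + 1, N)


print(factor_from_period(7, 15))      # (3, 5)
print(factor_from_period(14, 15))     # None
```

With a = 7 and N = 15 the order is 4, so x = 4, and the two greatest common divisors are 3 and 5. With a = 14 the order is 2 but x = 14 is congruent to −1, so that choice fails.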

So it all comes down to the likelihood of our being lucky. We show in
Appendix M that the probability is at least 0.5 that a random number
a in $G_{pq}$ has an order r that is even with
$a^{r/2} \not\equiv -1 \pmod{pq}$. So we
do not have to repeat the procedure an enormous number of times to
achieve a very high probability of success. If you’re willing to accept
the fact that you don’t have to try out very many random numbers a
in order to succeed, then this elementary argument is all you need to
know about why period finding enables you to factor N = pq. But if
you’re curious about why the probability of good fortune is so high,
then you must contend with Appendix M, where I have constructed an
elementary but rather elaborate argument, by condensing a fairly large
body of number-theoretic lore into the comparatively simple form it
assumes when applied to the special case in which the number N is the
product of two primes.
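
For small primes the fraction of lucky choices can simply be counted. The sketch below enumerates every a coprime to N = pq and tallies those with even order r and $a^{r/2} \not\equiv -1$; it is an empirical illustration under the assumption that p and q are distinct odd primes, not a substitute for the argument of Appendix M, and the function names are hypothetical.

```python
from math import gcd


def order(a, N):
    """Smallest r with a**r = 1 (mod N), found by brute force."""
    r, y = 1, a % N
    while y != 1:
        y = (y * a) % N
        r += 1
    return r


def lucky_fraction(p, q):
    """Fraction of a in G_pq with even order r and a**(r/2) != -1 (mod pq)."""
    N = p * q
    group = [a for a in range(1, N) if gcd(a, N) == 1]
    lucky = 0
    for a in group:
        r = order(a, N)
        if r % 2 == 0 and pow(a, r // 2, N) != N - 1:
            lucky += 1
    return lucky / len(group)


print(lucky_fraction(3, 5))   # 0.75
print(lucky_fraction(3, 7))   # 0.5
```

For N = 15 six of the eight members of the group are lucky; for N = 21 exactly half of the twelve are.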



Chapter 4 


Searching with a quantum computer 


4.1 The nature of the search 

Suppose you know that exactly one n-bit integer satisfies a certain
condition, and suppose you have a black-boxed subroutine that acts
on the $N = 2^n$ different n-bit integers, outputting 1 if the integer
satisfies the condition and 0 otherwise. In the absence of any other
information, to find the special integer you can do no better with a classical
computer than to apply the subroutine repeatedly to different random
numbers until you hit on the special one. If you apply it to M different
integers the probability of your finding the special number is M/N.
You must test $\tfrac12 N$ different integers to have a 50% chance of success.

If, however, you have a quantum computer with a subroutine that
performs such a test, then you can find the special integer with a
probability that is very close to 1 when N is large, using a method that calls
the subroutine a number of times no greater than $(\pi/4)\sqrt{N}$.

This very general capability of quantum computers was discovered 
by Lov Grover, and goes under the name of Grover's search algorithm. 
Shor’s period-finding algorithm and Grover’s search algorithm, 
together with their various modifications and extensions, constitute 
the two masterpieces of quantum-computational software. 

One can think of Grover’s black-boxed subroutine in various ways.
The subroutine might perform a mathematical calculation to determine
whether the input integer is the special one. Here is a simple example. If
an odd number p can be expressed as the sum of two squares, $m^2 + n^2$,
then since one of m or n must be even and the other odd, p must be of
the form 4k + 1. It is a fairly elementary theorem of number theory that
if p is a prime number of the form 4k + 1 then it can always be expressed
as the sum of two squares, and in exactly one way. (Thus 5 = 4 + 1,
13 = 9 + 4, 17 = 16 + 1, 29 = 25 + 4, 37 = 36 + 1, 41 = 25 + 16,
53 = 49 + 4, 61 = 36 + 25, etc.) Given any such prime p, the
simple-minded way to find the two squares is to take randomly selected
integers x with $1 \le x \le N$, with N the largest integer less than
$\sqrt{p/2}$, until you find the one for which $\sqrt{p - x^2}$ is an
integer a. If p is of the order of a trillion, then following the
simple-minded procedure you would have to calculate $\sqrt{p - x^2}$ for
nearly a million x to have a better than even chance of succeeding. But
using Grover’s procedure with an appropriately programmed quantum computer
you could succeed with a probability of success extremely close to 1 by
calling the quantum subroutine that evaluated $\sqrt{p - x^2}$ fewer than a
thousand times.
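
The simple-minded classical procedure can be sketched directly. The code below assumes that p is a prime of the form 4k + 1 (so that a solution exists) and takes the search range to end at the largest integer below $\sqrt{p/2}$, which is what the counts quoted above imply; the function name and the seeded random generator are choices made for the example.

```python
import math
import random


def two_squares_by_random_testing(p, rng):
    """Return (x, a, trials) with x*x + a*a == p, found by random testing.

    Assumes p is a prime of the form 4k + 1; otherwise the loop may never end.
    """
    limit = math.isqrt(p // 2)     # largest integer below sqrt(p/2) when p is odd
    trials = 0
    while True:
        trials += 1
        x = rng.randint(1, limit)  # a randomly selected candidate
        a = math.isqrt(p - x * x)
        if a * a == p - x * x:     # sqrt(p - x^2) is an integer
            return x, a, trials


x, a, trials = two_squares_by_random_testing(61, random.Random(0))
print(x, a)   # 5 6
```

Each pass through the loop corresponds to one call of the black-boxed test, so the expected number of calls grows in proportion to the size of the range.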

Mathematically well-informed friends tell me that for this particular
example there are ways to proceed with a classical computer that are
much more efficient than random testing, but the quantum algorithm to
be described below enables even mathematical ignoramuses, equipped
with a quantum computer, to do better than random testing by a
factor of $1/\sqrt{N}$. And Grover’s algorithm will provide this speed-up on
arbitrary problems.

Alternatively, the black box could contain Qbits that have been
loaded with a body of data - for example alphabetically ordered names
and phone numbers - and one might be looking for the name that went
with a particular phone number. It is with this kind of application in
mind that Grover’s neat trick has been called searching a database.
Using as precious a resource as Qbits, however, merely to store
classical information would be insanely extravagant, given our current or
even our currently foreseeable ability to manufacture Qbits. Finding a
unique solution - or one of a small number of solutions, as described in
Section 4.4 - to a tough mathematical puzzle seems a more promising
application.


4.2 The Grover iteration 

Grover’s algorithm assumes that we have been given a quantum search
subroutine that indicates, when presented with any n-bit integer x,
whether or not x is the special a being sought, returning this
information as the value of a function f(x) satisfying

$$f(x) = 0,\ x \ne a; \qquad f(x) = 1,\ x = a. \tag{4.1}$$

Grover discovered a completely general way to do significantly better
than the classical method of merely letting the subroutine operate on
different numbers from the list of $2^n$ candidates until it produces the
output 1. The quantum-computational speed-up relies on the usual
implementation of the subroutine that calculates f, in the form of a
unitary transformation $U_f$ that acts on an n-Qbit input register that
contains x and a 1-Qbit output register that is or is not flipped from 0
to 1, depending on whether x is or is not the special number a:

$$U_f\bigl(|x\rangle_n |y\rangle_1\bigr) = |x\rangle_n |y \oplus f(x)\rangle_1. \tag{4.2}$$

An example of a simple circuit that has precisely this action is shown
in Figure 4.1. The figure can be viewed as providing a minimalist
version of Grover’s algorithm, reminiscent of the Bernstein-Vazirani
problem (Section 2.4), though not susceptible to the special trick that
worked in that simpler case. In this minimalist example we are given a
black box containing the circuit depicted in Figure 4.1, but are not told
which of the n control Qbits are acted on by NOT gates - information
specified by the unknown n-bit integer a. If there were n Qbits in the
input register and the computer were classical, we could do no better
than to try each of the $N = 2^n$ possible inputs until we found the one
for which the output register was flipped. But using Grover’s algorithm
we can determine this information with probability quite close to 1, by
invoking the search subroutine no more than $\sqrt{N} = 2^{n/2}$ times - more
precisely $(\pi/4)\sqrt{N}$ times - when N is large.

Fig 4.1 A possible realization of a black box that executes the unitary
transformation $U_f(|x\rangle_n|y\rangle_1) = |x\rangle_n|y \oplus f(x)\rangle_1$,
where $f(x) = 0$, $x \ne a$; $f(x) = 1$, $x = a$. The input register has
n = 5 Qbits and the special number a is 10010. The 6-Qbit gate in the
center of the figure is a five-fold-controlled-NOT, which acts on the
computational basis to flip the target bit if and only if every one of the
five control bits is in the state $|1\rangle$. The construction of such a
gate out of more elementary gates is shown in Figures 4.4-4.7.

As in the Bernstein-Vazirani problem, it is useful to alter the flip of
the state of the output register into an overall sign change, by
transforming the 1-Qbit output register into the state

$$H|1\rangle = \tfrac{1}{\sqrt2}\bigl(|0\rangle - |1\rangle\bigr) \tag{4.3}$$

prior to the application of $U_f$. The action of $U_f$ is then to multiply the
(n + 1)-Qbit state by −1 if and only if x = a:

$$U_f\bigl(|x\rangle \otimes H|1\rangle\bigr) = (-1)^{f(x)}\,|x\rangle \otimes H|1\rangle. \tag{4.4}$$

In this form, the effect of $U_f$ on the states $|x\rangle \otimes H|1\rangle$ is
exactly the same as doing nothing at all to the 1-Qbit output register,
while acting on the n-Qbit input register with an n-Qbit unitary
transformation V that acts on the computational basis as follows:

$$V|x\rangle = (-1)^{f(x)}|x\rangle = \begin{cases} |x\rangle, & x \ne a, \\ -|a\rangle, & x = a. \end{cases} \tag{4.5}$$


Since $U_f$ is linear, so is V. Acting on a general superposition
$|\Psi\rangle = \sum_x |x\rangle\langle x|\Psi\rangle$ of computational basis
states, V changes the sign of the component of the state along $|a\rangle$,
while leaving unchanged the component orthogonal to $|a\rangle$:

$$V|\Psi\rangle = |\Psi\rangle - 2|a\rangle\langle a|\Psi\rangle. \tag{4.6}$$

So we can write V as

$$V = 1 - 2|a\rangle\langle a|, \tag{4.7}$$

where $|a\rangle\langle a|$ is the projection operator 1 on the state $|a\rangle$.

As we shall see, $U_f$ is the only unitary transformation appearing
in Grover’s algorithm that acts as anything other than the identity
on the output register. Because the output register starts in the state
$H|1\rangle$, unentangled with the input register, and because $U_f$ maintains
the output register in this particular state, the output register remains
unentangled with the input register and in the state $H|1\rangle$ throughout
Grover’s algorithm. We could continue to describe things in terms of
$U_f$ and retain the 1-Qbit output register, expanding (4.6), for example,
to the form

$$U_f\bigl(|\Psi\rangle \otimes H|1\rangle\bigr) = \bigl[|\Psi\rangle - 2|a\rangle\langle a|\Psi\rangle\bigr] \otimes H|1\rangle. \tag{4.8}$$

But it is simpler to suppress all explicit reference to the unaltered output
register, which is always unentangled with the input register and always
in the state $H|1\rangle$. We simply replace the (n + 1)-Qbit unitary $U_f$ with
the n-Qbit unitary V that acts on the n-Qbit input register, and define
all other operators that appear in the algorithm only by their action on
the input register, with the implicit understanding that they act as the
identity on the output register.

To execute Grover’s algorithm, we once again initially transform
the n-Qbit input register into the uniform superposition of all possible
inputs,

$$|\phi\rangle = H^{\otimes n}|0\rangle_n = \frac{1}{2^{n/2}}\sum_{x=0}^{2^n-1}|x\rangle_n. \tag{4.9}$$

In addition to V, Grover’s algorithm requires a second n-Qbit unitary
W that acts on the input register in a manner similar to V, but with a
fixed form that does not depend on a. The unitary transformation W
preserves the component of any state along the standard state $|\phi\rangle$
given in (4.9), while changing the sign of its component orthogonal to
$|\phi\rangle$:

$$W = 2|\phi\rangle\langle\phi| - 1, \tag{4.10}$$

where $|\phi\rangle\langle\phi|$ is the projection operator on the state
$|\phi\rangle$. We defer to Section 4.3 the not entirely obvious question of
how to build W out of 1- and 2-Qbit unitary gates.

Given implementations of V and W, Grover’s algorithm is quite
straightforward. It consists of simply applying many times the product
WV to the input register, taken initially to be in the state $|\phi\rangle$. Each
such application requires one invocation of the search subroutine.
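
The effect of repeatedly applying WV can be followed on a small example by tracking the $2^n$ real amplitudes on an ordinary computer. The sketch below is a classical simulation for illustration only; it uses the facts that V flips the sign of the amplitude of $|a\rangle$ and that $W = 2|\phi\rangle\langle\phi| - 1$ replaces every amplitude c by twice the mean amplitude minus c. The function name and the choice n = 5, a = 10010 (taken from Figure 4.1) are assumptions for the example.

```python
import math


def grover_success_probabilities(n, a, iterations):
    """Probability of measuring a after each application of WV to |phi>."""
    N = 2 ** n
    amp = [1 / math.sqrt(N)] * N               # uniform superposition |phi>
    probabilities = []
    for _ in range(iterations):
        amp[a] = -amp[a]                       # V = 1 - 2|a><a|
        mean = sum(amp) / N
        amp = [2 * mean - c for c in amp]      # W = 2|phi><phi| - 1
        probabilities.append(amp[a] ** 2)
    return probabilities


# n = 5 Qbits, special number a = 10010, four applications of WV
probs = grover_success_probabilities(5, 0b10010, 4)
```

With N = 32 the probability of finding a exceeds 0.99 after four applications, compared with 1/32 for a single random guess.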


1 This notation for projection operators is developed in Appendix A. 
Fig 4.2 Real linear combinations of the special state $|a\rangle$ and the
uniform superposition $|\phi\rangle = 2^{-n/2}\sum|x\rangle$ define a plane in
which these two states are very nearly orthogonal. The state
$|a_\perp\rangle$ in that plane is orthogonal to $|a\rangle$, and therefore makes
a small angle $\theta$ with $|\phi\rangle$. The unitary transformation V takes
any vector in the plane into its reflection in the line through the origin
along $|a_\perp\rangle$, so it leaves $|a_\perp\rangle$ invariant. The unitary
transformation W takes any vector in the plane into its reflection in the
line through the origin along $|\phi\rangle$, so it rotates $|a_\perp\rangle$
counterclockwise through the angle $2\theta$. Therefore the combined
operation WV rotates $|a_\perp\rangle$ counterclockwise through $2\theta$, and
since WV is a rotation it does the same to any vector in the plane.
(Vectors labeled in the figure: $WV|a_\perp\rangle$, $|\phi\rangle$, and
$|a_\perp\rangle = V|a_\perp\rangle$.)


To see what is accomplished by repeatedly applying WV to the
initial state $|\phi\rangle$, note that both V and W acting on either
$|\phi\rangle$ or $|a\rangle$ give linear combinations of these two states. Since
$\langle a|\phi\rangle = \langle\phi|a\rangle = 1/2^{n/2}$, whatever the value of
a, the linear combinations have real coefficients and are given by

$$V|a\rangle = -|a\rangle, \qquad V|\phi\rangle = |\phi\rangle - \frac{2}{2^{n/2}}|a\rangle;$$
$$W|\phi\rangle = |\phi\rangle, \qquad W|a\rangle = \frac{2}{2^{n/2}}|\phi\rangle - |a\rangle. \tag{4.11}$$

So if we start with the state $|\phi\rangle$ and let any sequence of these two
operators act successively, the states that result will always remain in
the two-dimensional plane spanned by real linear combinations of
$|\phi\rangle$ and $|a\rangle$. Finding the result of repeated applications of WV
to the initial state $|\phi\rangle$ reduces to an exercise in plane geometry.

It follows from the form (4.9) of $|\phi\rangle$ that $|\phi\rangle$ and
$|a\rangle$, considered as vectors in the plane of their real linear
combinations, are very nearly perpendicular, since the cosine of the angle
$\gamma$ between them is given by

$$\cos\gamma = \langle a|\phi\rangle = 2^{-n/2} = 1/\sqrt{N}, \tag{4.12}$$

which is small when N is large. It is convenient to define
$|a_\perp\rangle$ to be the normalized real linear combination of
$|\phi\rangle$ and $|a\rangle$ that is strictly orthogonal to $|a\rangle$ and
makes the small angle $\theta = \pi/2 - \gamma$ with $|\phi\rangle$,
as illustrated in Figures 4.2 and 4.3. Since

$$\sin\theta = \cos\gamma = 2^{-n/2} = 1/\sqrt{N}, \tag{4.13}$$

$\theta$ is very accurately given by

$$\theta \approx 2^{-n/2} \tag{4.14}$$

when N is large.
Fig 4.3 Since the rotation WV rotates any vector in the plane of real
linear combinations of $|a\rangle$ and $|\phi\rangle$ counterclockwise through
an angle $2\theta$, it takes $|\phi\rangle$ into a vector $WV|\phi\rangle$ that
makes an angle $3\theta$ with $|a_\perp\rangle$. This can also be seen directly
from the separate behaviors of V and W: V takes $|\phi\rangle$ into its
mirror image in $|a_\perp\rangle$, and W then takes $V|\phi\rangle$ into its
mirror image in $|\phi\rangle$. (Vectors labeled in the figure: $|a\rangle$,
$WV|\phi\rangle$, $|\phi\rangle$, $|a_\perp\rangle$, and $V|\phi\rangle$.)


Since W leaves $|\phi\rangle$ invariant and reverses the direction of any
vector orthogonal to $|\phi\rangle$, its geometrical action on any vector in
the two-dimensional plane containing $|\phi\rangle$, $|a\rangle$, and
$|a_\perp\rangle$ is simply to replace the vector by its reflection in the
mirror line through the origin along $|\phi\rangle$. On the other hand V
reverses the direction of $|a\rangle$ while leaving any vector orthogonal to
$|a\rangle$ invariant, so it acts on a general vector in the two-dimensional
plane by replacing it with its reflection in the mirror line through the
origin along $|a_\perp\rangle$. The product WV, being a product of
two two-dimensional reflections, is a two-dimensional rotation. 2 The
angle of that rotation is most easily seen by considering the effect of WV
on $|a_\perp\rangle$ (see Figure 4.2). The application of V leaves
$|a_\perp\rangle$ invariant, and the subsequent action of W on
$|a_\perp\rangle$ reflects it in the line through the origin along the
direction of $|\phi\rangle$. So the net effect of the rotation WV on
$|a_\perp\rangle$ is to rotate $|a_\perp\rangle$ past $|\phi\rangle$ through a
total angle that is twice the angle $\theta$ between $|a_\perp\rangle$ and
$|\phi\rangle$.

Because WV is a rotation, the result of applying it to any other
vector in the plane is also to rotate that vector through the angle
$2\theta$ in the direction from $|a_\perp\rangle$ to $|\phi\rangle$. So applying
WV to the initial state $|\phi\rangle$ gives a vector rotated away from
$|a_\perp\rangle$ by $3\theta$, since $|\phi\rangle$ is already rotated away
from $|a_\perp\rangle$ by $\theta$ (Figure 4.3). Applying WV a second time
results in a vector rotated away from $|a_\perp\rangle$ by $5\theta$, and each
subsequent application of WV increases the angle between the final state
and $|a_\perp\rangle$ by another $2\theta$. Since $\theta$ is very close to
$2^{-n/2}$, after an integral number of applications as close as possible to

$$(\pi/4)\,2^{n/2}, \tag{4.15}$$

the resulting state will be very nearly orthogonal to $|a_\perp\rangle$ in the
plane spanned by $|\phi\rangle$ and $|a\rangle$ - i.e. it will be very nearly
equal to $|a\rangle$ itself.


2 A two-dimensional reflection can be achieved by adding a third dimension
perpendicular to the plane and performing a 180° rotation with the mirror
line as axis. This reverses the irrelevant direction orthogonal to the plane.
The product of two such three-dimensional rotations is also a rotation, takes
the plane into itself, and does not reverse the third orthogonal direction, so it
is a two-dimensional rotation in the plane.

Fig 4.4 The n-fold-controlled-Z transformation, $c^nZ$, acts as the
identity on states of the computational basis unless all n control Qbits
are in the state $|1\rangle$, when it acts on the target Qbit as Z. Here it
is constructed out of doubly controlled gates, using an additional n − 2
ancillary Qbits, all initially in the state $|0\rangle$. One uses 2(n − 2)
$c^2X$ (Toffoli) gates and a $c^2Z$ gate. The construction is illustrated
for the case n = 5. The top three wires are the three ancillary Qbits. The
next five wires from the top are the five control Qbits, and the bottom
wire is the target Qbit. One easily verifies (by applying the circuit to
computational-basis states, with each of the ancillary Qbits in the state
$|0\rangle$) that Z acts on the target Qbit if and only if every one of the
five control Qbits is in the state $|1\rangle$. The Toffoli gates are
symmetrically disposed on both sides of the diagram to ensure that at the
end of the process each of the three ancillary Qbits is set back to its
initial state $|0\rangle$. This is essential if the ancillary Qbits are not to
become entangled with the Qbits on which the Grover iteration acts,
represented by the bottom six wires.

Consequently a measurement of the input register in the
computational basis will yield a with a probability very close to 1. We can
check to see whether we have been successful by “querying the oracle.” If
f(a) is 1, as it will be with very high probability, this confirms that we
have found the desired a. If we were unlucky we might have to repeat
the whole procedure a few more times before achieving success.
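
The plane-geometry picture gives the success probability in closed form: after k applications of WV the state makes the angle $(2k+1)\theta$ with $|a_\perp\rangle$, so the probability of measuring a is $\sin^2\bigl((2k+1)\theta\bigr)$ with $\sin\theta = 2^{-n/2}$. The sketch below evaluates this and picks the k that brings the angle closest to $\pi/2$; the function names are assumptions for the example.

```python
import math


def rotation_half_angle(n):
    """The angle theta between |a_perp> and |phi>, with sin(theta) = 2**(-n/2)."""
    return math.asin(2 ** (-n / 2))


def success_probability(n, k):
    """Probability of measuring a after k applications of WV."""
    return math.sin((2 * k + 1) * rotation_half_angle(n)) ** 2


def best_iteration_count(n):
    """Integer k for which (2k + 1) * theta is closest to pi / 2."""
    return round(math.pi / (4 * rotation_half_angle(n)) - 0.5)


for n in (5, 10, 20):
    k = best_iteration_count(n)
    print(n, k, (math.pi / 4) * 2 ** (n / 2), success_probability(n, k))
```

The printed values of k track $(\pi/4)2^{n/2}$, and the success probability approaches 1 as n grows. Applying WV more often than this rotates the state past $|a\rangle$ and lowers the probability again.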


4.3 How to construct W 

It remains to specify how to construct W out of 1- and 2-Qbit unitary
gates. Now −W works just as well as W for purposes of the search
algorithm, since it leads to a final state that differs, if at all, only by a
harmless overall minus sign. It follows from (4.9) and (4.10) and the
fact that $H^{\otimes n}$ is its own inverse that

$$-W = 1 - 2|\phi\rangle\langle\phi| = H^{\otimes n}\bigl(1 - 2|00\ldots00\rangle\langle00\ldots00|\bigr)H^{\otimes n}, \tag{4.16}$$

so we need a gate that acts as the identity on every computational-basis
state except $|00\ldots00\rangle$, which it multiplies by −1. This is just the
action of an (n − 1)-fold-controlled-Z gate, with the roles of the 1-Qbit
states $|0\rangle$ and $|1\rangle$ interchanged. This interchange is
accomplished by sandwiching the (n − 1)-fold-controlled-Z between
$X^{\otimes n}$ gates, and we therefore have

$$-W = H^{\otimes n} X^{\otimes n}\bigl(c^{n-1}Z\bigr)X^{\otimes n} H^{\otimes n}. \tag{4.17}$$
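
The decomposition of −W into Hadamards, NOTs, and a multiply-controlled-Z can be checked on a few Qbits by acting on amplitude lists. This is a classical verification sketch, not an implementation on hardware; the helper names are hypothetical, and the multiply-controlled-Z is modeled simply as a sign flip on the all-ones basis state.

```python
import math


def hadamard_all(amp):
    """Apply H to every Qbit (a fast Walsh-Hadamard transform)."""
    amp = list(amp)
    N = len(amp)
    h = 1
    while h < N:
        for start in range(0, N, 2 * h):
            for i in range(start, start + h):
                u, v = amp[i], amp[i + h]
                amp[i], amp[i + h] = u + v, u - v
        h *= 2
    return [c / math.sqrt(N) for c in amp]


def not_all(amp):
    """Apply X to every Qbit: the amplitude of |x> moves to the complement of x."""
    N = len(amp)
    return [amp[(N - 1) ^ x] for x in range(N)]


def controlled_z(amp):
    """Multiply-controlled-Z: change the sign of |11...1> only."""
    amp = list(amp)
    amp[-1] = -amp[-1]
    return amp


def minus_w_circuit(amp):
    for gate in (hadamard_all, not_all, controlled_z, not_all, hadamard_all):
        amp = gate(amp)
    return amp


def minus_w_direct(amp):
    """-W = 1 - 2|phi><phi|: subtract twice the mean amplitude."""
    mean = sum(amp) / len(amp)
    return [c - 2 * mean for c in amp]
```

Applying both functions to every computational-basis state of a small register and comparing the results confirms that the circuit reproduces $1 - 2|\phi\rangle\langle\phi|$.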
Fig 4.5 An improved version of Figure 4.4, with twice as many gates.
Gates have been added on the left and right to ensure that the circuit
works for arbitrary initial computational-basis states of the three
ancillary Qbits at the top, restoring them to their initial states at the
end of the computation. To see this note that because Toffoli gates or
$c^2Z$ gates are their own inverses, the circuit acts as the identity on
those computational-basis states of all nine Qbits in which any one of
the five control Qbits (second through sixth wires from the bottom) is
in the state $|0\rangle$, regardless of the computational-basis states of the
other Qbits. This is because, as an examination of the figure reveals,
replacing the gate governed by any one of the five control Qbits by the
identity always results in a pairwise cancellation of all the remaining
gates. It remains only to confirm that when all five control Qbits are in
the state $|1\rangle$, the circuit acts as Z on the target Qbit at the bottom,
and the state of the three ancillary Qbits at the top is unchanged. This is
established in Figure 4.6, which shows the operation of the gates in
Figure 4.5 when the five control Qbits are all in the state $|1\rangle$. Because
X = HZH one can also use this circuit to produce a
multiply-controlled-NOT gate, by applying Hadamard gates to the bottom wire
on the far right and left.


We can construct W by constructing $c^{n-1}Z$, the
(n − 1)-fold-controlled-Z.

Figure 4.4 shows a straightforward but not terribly efficient way to
make a $c^{n-1}Z$ gate for the case n = 6. We use n − 3 ancillary Qbits, all
initially in the state $|0\rangle$, 2(n − 3) $c^2X$ (Toffoli) gates, and one
$c^2Z$ gate. As explained in Section 2.6, these can all be built out of 1- and
2-Qbit gates. It is essential for the success of the algorithm that each
ancillary Qbit be restored to its initial state $|0\rangle$, since our analysis
of the Grover algorithm assumes that the input and output registers have
states of their own, unentangled with any other Qbits, after each
application of W and V.
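
On computational-basis states the Toffoli gates act as ordinary reversible logic, so the ladder construction can be checked classically. The wiring below is an assumption based on the description of Figure 4.4 (each ancilla accumulates the AND of one more control, the $c^2Z$ is controlled by the last ancilla and the last control, and the Toffoli gates are then undone in reverse order); the function names are hypothetical.

```python
from itertools import product


def multiply_controlled_z(controls, target):
    """Act on one basis state; return (sign, ancillas_after).

    controls: list of k >= 3 bits; target: one bit. The k - 2 ancillas start at 0.
    """
    k = len(controls)
    anc = [0] * (k - 2)

    def toffoli(i):
        # ancilla i is flipped iff both of its control bits are 1
        first = controls[0] if i == 0 else anc[i - 1]
        anc[i] ^= first & controls[i + 1]

    for i in range(k - 2):               # compute the chain of ANDs
        toffoli(i)
    sign = -1 if (anc[-1] & controls[-1] & target) else 1   # doubly controlled Z
    for i in reversed(range(k - 2)):     # undo the chain, restoring the ancillas
        toffoli(i)
    return sign, anc


def check_all_basis_states(k):
    """True if Z hits the target iff all k controls are 1, with ancillas restored."""
    for bits in product([0, 1], repeat=k + 1):
        controls, target = list(bits[:k]), bits[k]
        sign, anc = multiply_controlled_z(controls, target)
        expected = -1 if (all(controls) and target) else 1
        if sign != expected or any(anc):
            return False
    return True


print(check_all_basis_states(5))   # True
```

For five controls this uses three ancillas and six Toffoli gates, matching the count 2(n − 3) with n = 6.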
Fig 4.6 Part (a) reproduces what remains of Figure 4.5 when all five
control Qbits are in the state $|1\rangle$. One easily verifies that two
identical cNOT gates, separated by a NOT acting on their control Qbit,
have exactly the same action on the computational basis as NOT gates
acting on both the control and target Qbits. As a result each of the two
identical sets of five adjacent gates acting on the three ancillary Qbits at
the top of part (a) reduces simply to three NOT gates, as shown in part
(b). Making this further simplification in part (a), note that because each
of the three ancillary Qbits is acted on by two NOT gates, its state is
unaltered. The two NOT gates acting on the upper wire also ensure that
precisely one of the two cZ gates applies Z to the bottom Qbit, whatever
the state of the upper wire.
The construction of Figure 4.4 is rather expensive in Qbits,
requiring n − 3 ancillas to apply the algorithm to an n-bit set of possibilities
for the special number a. At a cost of four times as many Toffoli gates,
one can reduce the number of ancillas to a single one, regardless of the
size of n. The way to do this is developed in Figures 4.5-4.7. Figures 4.5
and 4.6 show how nearly doubling the number of gates makes it possible
for the construction of Figure 4.4 to work for arbitrary initial states of
the ancillas. Figure 4.7 then shows how, by an additional doubling, one
can, with the aid of a single ancilla, divide an n-fold-controlled-Z into
two multiply-controlled-NOT gates and two multiply-controlled-Z
gates, each acting on about $\tfrac12 n$ Qbits. (Since X = HZH, one can convert
a multiply-controlled-Z gate into a multiply-controlled-NOT gate by
applying Hadamard gates to the target Qbit at the beginning and end
of the circuit.) The multiply-controlled-Z gates in Figure 4.7 are able
nondisruptively to use the control Qbits of the multiply-controlled-NOT
gates as their ancillary Qbits in the construction of Figure 4.5.
And the multiply-controlled-NOT gates in Figure 4.7 can make similar
use of the control Qbits of the multiply-controlled-Z gates.


4.4 Generalization to several special numbers 

If there are several special numbers, essentially the same algorithm can
be used to find one of them, if we know how many there are. The
function f in (4.1) now becomes

$$f(x) = 1,\ x = a_1, \ldots, a_m; \qquad f(x) = 0,\ x \ne a_1, \ldots, a_m. \tag{4.18}$$
Fig 4.7 The identity illustrated by the circuit is easily confirmed.
There is only one ancilla, whose state is left unchanged. By
introducing circuits of the form in Figure 4.5 into this circuit one can
produce $c^nZ$ or $c^nX$ gates with the aid of just a single ancilla. (Since
X = HZH Figure 4.5 works for either type.) In constructing each of
the multiply-controlled-NOT gates in Figure 4.7 out of Toffoli gates,
one can borrow the control Qbits of the multiply-controlled-Z gates to
use as ancillary Qbits in the expansions of Figure 4.5, since those
expansions work whatever the state of their ancillary Qbits, and restore
that state to its original form. For the same reasons one can also
borrow the control Qbits of the multiply-controlled-NOT gates to
construct the multiply-controlled-Z gates.


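The n-Qbit unitary transformation V extracted from (4.4) becomes
one whose action on computational-basis states in the input register is
given by

    V|x\rangle = |x\rangle, \quad x \neq a_1, \ldots, a_m; \qquad V|x\rangle = -|x\rangle, \quad x = a_1, \ldots, a_m. \qquad (4.19)

If we replace the state |a⟩ by

    |\psi\rangle = \frac{1}{\sqrt{m}} \sum_{i=1}^{m} |a_i\rangle, \qquad (4.20)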


then starting with |φ⟩, which continues to have the form (4.9), the
transformations V and W now keep the state of the input register in
the two-dimensional plane spanned by the real linear combinations of
|ψ⟩ and |φ⟩. The unitary transformation V changes the sign of |ψ⟩ but
preserves the linear combination of |ψ⟩ and |φ⟩ orthogonal to |ψ⟩, so
V is now a reflection in the line through the origin along the vector
|ψ⊥⟩ perpendicular to |ψ⟩ in the plane. Everything else is just as in the
case of a single special number except that now the angle θ between
|ψ⊥⟩ and |φ⟩ satisfies

    \sin\theta = \cos(\pi/2 - \theta) = \langle\psi|\phi\rangle = \sqrt{m/2^n}. \qquad (4.21)
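
The inner product in (4.21) can be checked in one line. The sketch below assumes that |φ⟩ is the uniform superposition of all 2^n computational-basis states, as in the single-number case, and that |ψ⟩ is the normalized equal superposition of the m special states; each special state then contributes exactly one nonzero term to the overlap.

```latex
\langle\psi|\phi\rangle
  = \Bigl(\frac{1}{\sqrt{m}}\sum_{i=1}^{m}\langle a_i|\Bigr)
    \Bigl(\frac{1}{2^{n/2}}\sum_{x=0}^{2^n-1}|x\rangle\Bigr)
  = \frac{1}{\sqrt{m}\,2^{n/2}}\sum_{i=1}^{m} 1
  = \sqrt{\frac{m}{2^n}} .
```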
When m/2^n ≪ 1, we can arrive at a state very close to |ψ⟩ with

    (\pi/4)\, 2^{n/2} / \sqrt{m} \qquad (4.22)

applications of WV. A measurement then gives us, with a probability
very close to 1, a random one of the special values a_i. Note that the
mean number of invocations of the subroutine decreases only as 1/√m
with the number m of marked items, in contrast to a classical search,
for which doubling the number of acceptable solutions would halve the
time of the search. When m/2^n is not small we have to reexamine the
expression (4.22) for the optimal number of iterations, but at that point the
quantum search offers little significant advantage over a classical one.
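
The iteration count (4.22) can be tried out numerically. The following is only an illustrative sketch in plain Python, not anything taken from the text: the function names, the choice n = 10, and the particular marked values are all assumptions made for the example. It stores the 2^n real amplitudes in a list, applies V as a sign flip on the marked entries and W = 2|φ⟩⟨φ| − 1 as a reflection of every amplitude about the mean, and rounds (4.22) down to an integer.

```python
import math

def grover_success_probability(n, marked, iterations):
    """Probability that a measurement after `iterations` applications of WV
    yields one of the marked (special) numbers."""
    N = 2 ** n
    amp = [1 / math.sqrt(N)] * N            # |phi>: uniform superposition
    for _ in range(iterations):
        for a in marked:                    # V: flip the sign of each special state
            amp[a] = -amp[a]
        mean = sum(amp) / N                 # W = 2|phi><phi| - 1:
        amp = [2 * mean - x for x in amp]   # reflect every amplitude about the mean
    return sum(amp[a] ** 2 for a in marked)

def estimated_iterations(n, m):
    """Integer part of (pi/4) 2^(n/2) / sqrt(m), the estimate (4.22)."""
    return int((math.pi / 4) * math.sqrt(2 ** n / m))

# Hypothetical example: n = 10 Qbits and m = 4 special numbers.
n, marked = 10, [3, 77, 500, 901]
k = estimated_iterations(n, len(marked))
p = grover_success_probability(n, marked, k)
print(k, p)
```

With these illustrative values m/2^n = 1/256, so the small-angle estimate should be good, and the printed probability comes out close to 1.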

We must know how many special numbers there are for the procedure 
to work, since we have to know how many times to do the Grover 
iteration before making our measurement. By exploiting the fact that 
the Grover iteration is periodic, restoring the initial state after about 
π 2^{n/2}/√m iterations, it is possible to combine Grover iterations with a
clever application of the quantum Fourier transform to learn the value 
of m with enough accuracy to enable one then to apply the Grover 
iteration the right number of times to ensure a high probability of 
success, even when m is not known at the start. 


4.5 Searching for one out of four items 

The simplest nontrivial application of Grover’s algorithm is to the case
n = 2, or N = 4. (When n = 1 a single invocation of the subroutine
suffices to identify a even with a classical computer.) When n = 2,
(4.13) tells us that sin θ = 1/2, so θ = 30°. Consequently 3θ = 90°, and
the probability of identifying a with a single invocation of the subroutine
is exactly 1.
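
The arithmetic can be written out explicitly. The sketch below uses the geometric picture of the Grover iteration: the initial state makes an angle θ with the direction perpendicular to |a⟩, and each application of WV rotates it by a further 2θ, so after k iterations the probability of finding a is the squared sine of (2k + 1)θ. The symbol p_k is introduced here only for illustration.

```latex
\sin\theta = 2^{-n/2} = \tfrac{1}{2} \;\Longrightarrow\; \theta = 30^\circ,
\qquad
p_k = \sin^2\bigl((2k+1)\,\theta\bigr),
\qquad
p_1 = \sin^2(3\theta) = \sin^2 90^\circ = 1 .
```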

This is a significant improvement on the classical computer, with
which one can do no better than trying each of the four possibilities
for a in random order. It is equally likely that the marked item will
be the first, second, third, or fourth on the list. Since the probability
is 1/4 that the marked item is first on the list, 1/4 that it is second, and
1/4 + 1/4 = 1/2 that it is third or fourth, the mean number of attempts
is (1/4) × 1 + (1/4) × 2 + (1/2) × 3 = 9/4 = 2.25. (It is not necessary to make a fourth
attempt, since if the first three attempts fail to produce a, then one
knows that a is the one remaining untested number.)
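
The same counting can be done for a list of any length. This is a small sketch using exact fractions; the function name and the generalization from 4 to N items are assumptions added for illustration, following the same reasoning as above (the last untested item never needs to be tried).

```python
from fractions import Fraction

def mean_classical_attempts(N):
    """Mean number of attempts needed to locate one marked item among N
    (N >= 2) when items are tried in random order and the last one is
    identified by elimination rather than by an attempt."""
    p = Fraction(1, N)                       # each position is equally likely
    # Found on attempt k (k = 1 .. N-2) with probability 1/N each;
    # attempt N-1 settles both of the two remaining positions.
    return sum(k * p for k in range(1, N - 1)) + (N - 1) * 2 * p

print(mean_classical_attempts(4))   # 9/4
```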

The case n = 2 is also special in that one does not have to resort 
to the elaborate procedure specified in Figures 4.4-4.7 to produce the 
n-fold-controlled-Z gate. A single Toffoli gate sandwiched between
Hadamards on the target Qbit does the job. 



Chapter 5 


Quantum error correction 


5.1 The miracle of quantum error correction 

Correcting errors might sound like a dreary practical problem, of little 
aesthetic or conceptual interest. But aside from being of crucial
importance for the feasibility of quantum computation, it is also one of
the most beautiful and surprising parts of the subject. The surprise 
is that error correction is possible at all, since the only way to detect 
errors is to make measurements, but measurement gates disruptively 
alter the states of the measured Qbits, apparently making things even 
worse. “Quantum error correction” would seem to be an oxymoron. 
The beauty lies in the ingenious ways that people have found to get 
around this apparently insuperable obstacle. 

The discovery in 1995 of quantum error correction by Peter Shor 
and, independently, Andrew Steane had an enormous impact on the 
prospects for actual quantum computation. It changed the dream of 
building a quantum computer capable of useful computation from a 
clearly unattainable vision to a program that poses an enormous but 
not necessarily insuperable technological challenge. 

Error correction is not a major issue for classical computation. In a
classical computer the physical systems that embody individual bits -
the Cbits - are immense on the atomic scale. The two states of a Cbit
representing 0 and 1 are so grossly different that the probability is
infinitesimal for flipping from one to the other as a result of thermal
fluctuations, mechanical vibrations, or other irrelevant extraneous
interactions.

Error correction does become important, even classically, in the 
transmission of information over large distances, because the farther 
the signal travels, the more it attenuates. One can deal with this in a 
variety of straightforward or ingenious ways. One of the crudest is to 
encode each logical bit in three actual bits, replacing |0⟩ and |1⟩ by the
codewords

    |\bar{0}\rangle = |0\rangle|0\rangle|0\rangle = |000\rangle, \qquad |\bar{1}\rangle = |1\rangle|1\rangle|1\rangle = |111\rangle. \qquad (5.1)


One can then monitor each codeword, checking for flips in any of
the individual Cbits and restoring them by applying the principle of
majority rule whenever a flip is detected. Monitoring has to take place
often enough to make negligible the probability that more than a single
bit has flipped in a single codeword between inspections.

Quantum error correction also uses multi-Qbit codewords and also 
requires monitoring at a rate that renders certain kinds of compound 
errors highly improbable. But there are several ways in which error 
correction in a quantum computer is quite different. 

(a) A quantum computer, unlike a classical computer, requires error
correction. The physical Qbits are individual atomic-scale physical
systems such as atoms, photons, trapped ions, or nuclear magnetic
moments. Any coupling to anything not under the explicit control
of the computer and its program can substantially disrupt the
state associated with those Qbits, entangling them with computationally
irrelevant features of the computer or the world outside
the computer, thereby destroying the computation. For a quantum
computer to work without error correction, each Qbit would have to
be impossibly well isolated from irrelevant interactions with other
parts of the computer and anything else in its environment.

(b) In contrast to classical error correction, checking for errors in a
quantum computer is problematic. The obvious way to monitor a
Qbit is to measure it. But the result of measuring a Qbit is to alter
its state, if it has one of its own, and, more generally, to destroy
its quantum correlations with other Qbits with which it might be
entangled. Such disruptions are stochastic - i.e. unpredictable -
and introduce major errors of their own. One must turn to less
obvious forms of monitoring.

(c) Bit flips are not the only errors. There are entirely nonclassical
sources of trouble. For example phase errors, such as the alteration
of |0⟩ + |1⟩ to |0⟩ − |1⟩, can be just as damaging.

(d) Unlike the discrete all-or-nothing bit-flip errors suffered by Cbits,
errors in the state of Qbits grow continuously out of their uncorrupted
state.

We begin our discussion of error correction by examining in 
Section 5.2 a simple model of quantum error correction that works 
when the possible errors are artificially limited to a few specific kinds 
of disruption. Although this is clearly unrealistic, the error-correction 
procedure is easy to follow. It also introduces in a simple setting most 
of the tricks that continue to work in the more realistic case. 


5.2 A simplified example 

Much of the flavor of quantum error correction is conveyed by an
artificially simple model in which the only errors a collection of Qbits
is allowed to experience are the classically meaningful errors: random
flips of individual Qbits. We shall examine the more general possibilities
for quantum errors in Section 5.3 below.

Bit-flip errors in a computation can be modeled by a circuit that 
differs from the ideal error-free circuit only in the occasional presence 
of extraneous 1-Qbit NOT gates. If such randomly occurring error- 
producing NOT gates are sufficiently rare, then since the only allowed 
errors are bit-flip errors, one might hope to be able to correct the 
corruptions they introduce by tripling the number of Qbits and using 
precisely the 3-Qbit code (5.1) that corrects for bit-flip errors in the 
classical case. Because of the disruptive effect of measurement gates 
in diagnosing errors, it is not at all clear that such a 3-Qbit code can 
be effective for bit-flip errors in the quantum case. It can nevertheless 
be made to work, though the way in which one does the encoding and 
performs the error correction is much subtler for Qbits than it is for
Cbits.

To begin with, there is the question of encoding. Classically one
merely replaces each of the two computational-basis states |x⟩ by the
codeword states |x̄⟩ = |x⟩|x⟩|x⟩, for x = 0 or 1. Qbits, however, can
also be in superpositions α|0⟩ + β|1⟩, and one requires a circuit that
automatically encodes this into α|0̄⟩ + β|1̄⟩ = α|0⟩|0⟩|0⟩ + β|1⟩|1⟩|1⟩
for arbitrary α and β, in the absence of any knowledge of what the values
of α and β might be. This can be done with two cNOT gates that target
two additional Qbits initially both in the state |0⟩, as illustrated in
Figure 5.1:


    \alpha|\bar{0}\rangle + \beta|\bar{1}\rangle = \alpha|0\rangle|0\rangle|0\rangle + \beta|1\rangle|1\rangle|1\rangle = C_{21} C_{20} \bigl(\alpha|0\rangle + \beta|1\rangle\bigr)|0\rangle|0\rangle. \qquad (5.2)

Fig 5.1

A unitary circuit that encodes the 1-Qbit state α|0⟩ + β|1⟩ into the
3-Qbit code state α|000⟩ + β|111⟩, using two cNOT gates and two
other Qbits each initially in the state |0⟩. The circuit clearly works for
the computational-basis states |0⟩ and |1⟩, and therefore, by linearity,
it works for arbitrary superpositions.
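
The action of the two cNOT gates in (5.2) can be traced term by term. The sketch below is illustrative only: it represents a state as a Python dictionary from bit strings to amplitudes, with the leftmost character standing for Qbit 2, and the helper names `cnot` and `encode` are assumptions introduced for this example.

```python
def cnot(state, control, target):
    """Apply a cNOT to a state stored as {bitstring: amplitude}.
    `control` and `target` are string positions (0 is the leftmost Qbit)."""
    out = {}
    for bits, amp in state.items():
        b = list(bits)
        if b[control] == "1":                       # flip target only when control is 1
            b[target] = "1" if b[target] == "0" else "0"
        key = "".join(b)
        out[key] = out.get(key, 0) + amp
    return out

def encode(alpha, beta):
    """Encode alpha|0> + beta|1> (held by the leftmost Qbit) into the 3-Qbit code."""
    state = {"000": alpha, "100": beta}             # (alpha|0> + beta|1>)|0>|0>
    state = cnot(state, control=0, target=2)        # C_20
    state = cnot(state, control=0, target=1)        # C_21
    return state

print(encode(0.6, 0.8j))   # amplitudes sit on "000" and "111" only
```

Because each gate merely permutes basis states, the amplitudes α and β are carried along untouched, which is the linearity argument of the caption in executable form.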

Having produced such a 3-Qbit codeword state, we must then guard
against its corruption by the possible action of an extraneous NOT
gate that acts on at most one of the three Qbits, as illustrated in
Figure 5.2. This is easily done for Cbits, for which there are only
two possible uncorrupted initial states, |000⟩ and |111⟩, and examining
them is unproblematic. To see whether either initial state has been
corrupted by the action of a single NOT gate, one nondisruptively reads
the three Cbits. If this reveals all three Cbits to be in the same state,
there is no corruption to correct. If one of them is found to be in a
different state from the other two, that particular Cbit is the one that
was acted upon by the extraneous NOT gate, and applying a second
NOT gate to that Cbit restores the initial state.

Fig 5.2

The encoded state of Figure 5.1 may or may not be corrupted by the
action of a single extraneous NOT gate. The error-inducing gates are
depicted in a lighter font than ordinary X gates and inside a
noisy-looking corrupted box.

One cannot, however, nondisruptively “read” the state of a collection
of Qbits. The only way to extract information is by the action of
measurement gates. But measuring any of the three Qbits immediately
destroys the uncorrupted superposition

    |\Psi\rangle = \alpha|000\rangle + \beta|111\rangle, \qquad (5.3)

converting it either to |000⟩ (with probability |α|^2) or to |111⟩ (with
probability |β|^2). There is a similar coherence-destroying effect on
each of the three possible corrupted states,

    |\Psi_0\rangle = X_0|\Psi\rangle = \alpha|001\rangle + \beta|110\rangle,
    |\Psi_1\rangle = X_1|\Psi\rangle = \alpha|010\rangle + \beta|101\rangle, \qquad (5.4)
    |\Psi_2\rangle = X_2|\Psi\rangle = \alpha|100\rangle + \beta|011\rangle,

obliterating any dependence of the post-measurement state on the complex
amplitudes α and β. This might appear (and for some time was
thought) to be the end of the story: quantum error correction is impossible
because of the disruptive effect of the measurement needed to
diagnose the error.

But there are subtler ways to extract the information needed to diagnose
and correct possible errors. Although there continues to be a
disruption in these refined procedures, the damaging effects are shifted
from the codeword Qbits to certain ancillary Qbits. By coupling the
codeword Qbits to these ancillary Qbits with appropriate 2-Qbit unitary
gates, and then applying measurement gates only to the ancillas, one can
extract information about certain relations prevailing among the codeword
Qbits. This more limited information turns out to be enough to
diagnose and correct certain errors in a coherence-preserving manner,
without revealing anything about the original uncorrupted codeword
state. Acquiring no information about the uncorrupted state is a necessary
restriction on any error-correction procedure capable of perfectly
restoring the uncorrupted state. If one could get even partial information
about the structure of a state without disrupting it, one could
continue collecting additional information nondisruptively until one
was well on the way to violating the no-cloning theorem.

Note that all possible forms for the uncorrupted 3-Qbit codeword
(5.3) - given by assigning all possible values to the amplitudes α and β -
lie in a two-dimensional subspace of the full eight-dimensional space
containing all possible 3-Qbit states. Furthermore, each of the three
allowed corruptions (5.4) also lies in a two-dimensional subspace of
the full 3-Qbit space, and the three subspaces containing the three
allowed corruptions are each orthogonal to the subspace containing
the uncorrupted codeword, and orthogonal to each other. This turns
out to be crucial to the success of the enterprise.

More generally, if we wanted to use an n-Qbit codeword in a model
in which the only allowed errors were flips of a single Qbit, then we
would require 2(1 + n) dimensions to accommodate the n + 1 mutually
orthogonal two-dimensional subspaces associated with a general uncorrupted
state and its n different 1-Qbit corruptions. Since all possible
states of n Qbits span a 2^n-dimensional space, a necessary condition
for an n-Qbit bit-flip-error-correcting code to be possible is

    2^{n-1} \geq 1 + n. \qquad (5.5)

The smallest n satisfying (5.5) is n = 3, for which it holds as an equality.
This shows that the 3-Qbit code is, in this sense, perfect for the purpose
of correcting errors limited to flips of a single Qbit.
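
The counting argument is easy to check by brute force. A minimal sketch (the function name and the search bound `max_n` are assumptions chosen for illustration) that compares the 2^n available dimensions with the 2(1 + n) required ones:

```python
def smallest_bit_flip_code_size(max_n=20):
    """Smallest n for which n Qbits offer at least 2(1 + n) dimensions,
    i.e. room for n + 1 mutually orthogonal two-dimensional subspaces."""
    for n in range(1, max_n + 1):
        available = 2 ** n
        required = 2 * (1 + n)
        if available >= required:
            return n, available, required
    return None

print(smallest_bit_flip_code_size())   # (3, 8, 8): the bound is met with equality
```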

Figure 5.3 shows that 3-Qbit codewords, as well as meeting this
necessary condition for the correction of quantum bit-flip errors, actually
do permit it to be carried out. The error detection and correction
requires two additional ancillary Qbits (the upper two wires), initially
both in the state |0⟩. Both ancillas are targeted by pairs of cNOT gates
controlled by subsets of the three codeword Qbits. Note first that if
the 3-Qbit codeword has not been corrupted, so its state remains (5.3),
then both the ancillary Qbits remain in the state |0⟩ after the action of
the cNOT gates, since the term |000⟩ in the codeword results in none
of the target Qbits being flipped, while the term |111⟩ results in both
of the target Qbits being flipped twice, which is equivalent to no flip.

In a similar way each of the three corruptions (5.4) results in a
different unique final state for the ancillary Qbits. The first of those
corruptions results in |0⟩ for the upper ancilla and |1⟩ for the lower,
since either term in the superposition α|001⟩ + β|110⟩ results in zero
or two flips for the upper ancilla, and a single flip for the lower ancilla.
The next form in (5.4) produces a single flip for both ancillas, resulting
in |1⟩ for both. The third results in |1⟩ for the upper and |0⟩ for the
lower ancilla.

Fig 5.3

How to detect and correct the three possible single-bit-flip
errors shown in Figure 5.2. One requires two ancillary Qbits (the upper
two wires), each initially in the state |0⟩, coupled to the codeword
Qbits by cNOT gates. After the cNOT gates have acted each ancilla is
measured. If both measurements give 0, then none of the erroneous
NOT gates on the left have acted and none of the error-correcting
NOT gates on the right need to be applied. If the upper measurement
gate shows x = 1 and the lower one shows y = 0, then the uppermost
of the three erroneous NOT gates has acted on the left. Its action is
undone by applying the uppermost of the three NOT gates on the
right. The other two possible 1-Qbit errors are similarly corrected.

So if the two ancillary Qbits are measured after the cNOT gates
have acted, the four possible readings, 00, 01, 10, and 11, of the two
measurement gates reveal whether or not a random one of the codeword
Qbits has been flipped and, in the event of a flip, which of the three has
suffered it. On the basis of this information one can either accept the
codeword as uncorrupted or apply a NOT gate to the Qbit that has been
identified as the flipped one, thereby restoring the initial uncorrupted
state. One easily confirms that this is precisely what is accomplished
by the NOT gates on the extreme right of Figure 5.3.
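
The whole diagnose-and-correct cycle can be followed on a five-Qbit state. The sketch below is illustrative, not the book's construction verbatim: it assumes the wiring implied by the discussion above (the upper ancilla is targeted by cNOTs controlled by Qbits 2 and 1, the lower ancilla by cNOTs controlled by Qbits 1 and 0), stores states as dictionaries from bit strings "x y q2 q1 q0" to amplitudes, and uses hypothetical helper names throughout.

```python
X_ANC, Y_ANC, Q2, Q1, Q0 = 0, 1, 2, 3, 4      # string positions, left to right

def flip(bits, pos):
    return bits[:pos] + ("1" if bits[pos] == "0" else "0") + bits[pos + 1:]

def apply_x(state, pos):
    """NOT gate on one Qbit: a permutation of the basis states."""
    return {flip(b, pos): a for b, a in state.items()}

def apply_cnot(state, control, target):
    return {(flip(b, target) if b[control] == "1" else b): a
            for b, a in state.items()}

def diagnose_and_correct(alpha, beta, error=None):
    """Start from ancillas |00> and codeword alpha|000> + beta|111>, optionally
    flip one codeword Qbit, then run the syndrome circuit and the correction."""
    state = {"00000": alpha, "00111": beta}
    if error is not None:
        state = apply_x(state, error)                      # extraneous NOT gate
    for control, target in [(Q2, X_ANC), (Q1, X_ANC), (Q1, Y_ANC), (Q0, Y_ANC)]:
        state = apply_cnot(state, control, target)
    syndromes = {b[:2] for b in state}                     # ancilla values in each term
    assert len(syndromes) == 1                             # same in both terms, so measuring
    syndrome = syndromes.pop()                             # them reveals nothing about alpha, beta
    fix = {"10": Q2, "11": Q1, "01": Q0}.get(syndrome)     # "00" means no error
    if fix is not None:
        state = apply_x(state, fix)                        # error-correcting NOT gate
    return syndrome, {b[2:]: a for b, a in state.items()}

for err in (None, Q2, Q1, Q0):
    print(err, diagnose_and_correct(0.6, 0.8j, err))
```

In every case the returned codeword is again α|000⟩ + β|111⟩, and the ancilla reading is identical in both terms of the superposition, which is why measuring the ancillas leaves the amplitudes undisturbed.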

This accomplishes what any valid quantum error-correction procedure
must do: it restores the original uncorrupted state without revealing
any information whatever about what the form of that state -
the particular values of the amplitudes α and β - might actually be.
The procedure succeeds in preserving the superposition by extracting
information only about correlations among the Qbits making up the
codeword, without ever extracting information about individual Qbits.
Working only with correlations makes it possible to apply a single linear
test that works equally well for diagnosing 1-Qbit errors in either |000⟩
or |111⟩, and therefore also works for any superposition of those states.

This simple example of quantum error correction requires the use
of measurement gates to diagnose the error. The outputs of the measurement
gates are noted, and then used to determine which, if any, of a
collection of error-correcting NOT gates should be applied. The procedure
can be automated into a bigger quantum circuit that eliminates
(or almost eliminates) the need to use measurement gates combined
with unitary gates, which are or are not applied depending on the readings
of the measurement gates. This can be achieved by a combination
of cNOT and Toffoli gates, controlled by the ancillary Qbits, as shown
in Figure 5.4.

Replacing measurement gates by additional cNOT gates does not 
entirely eliminate the need for measurement, because at the end of the 
process the state of the ancillary Qbits will depend on the character of 
the error and will in general no longer be the state |0⟩|0⟩ with which the
error-correction procedure starts. If one wishes to reuse these ancillary
Qbits for further error correction, it is necessary to reset them to their
initial state |0⟩|0⟩. This can efficiently be done by measuring them and
applying the appropriate NOT gates if either is found to be in the state
|1⟩. Thus measurement gates followed by NOT gates that depend on
the measurement outcome are still needed to prepare the circuit for a 
possible future error correction. 

This procedure (automated or not) will continue to work even when
the codeword Qbits are entangled with many other codeword Qbits, as
they will be in the course of a nontrivial computation. In such a case
the codeword Qbits have no state of their own, the state of all the many
codeword Qbits being of the form

    \alpha|000\rangle|\Psi\rangle + \beta|111\rangle|\Phi\rangle, \qquad (5.6)

with the error correction applied to the three Qbits on the left.
One easily confirms that the added complication of entanglement
with other Qbits has no effect on the validity of the error-correction
procedure.

Fig 5.4

Automation of the error-correction process of Figure 5.3. The
three controlled gates on the right - one of them a doubly controlled
Toffoli gate with multiple targets - have precisely the same
error-correcting effect on the three codeword Qbits as does the
application of NOT gates contingent on measurement outcomes in
Figure 5.3. The final state |xy⟩ of the ancillas (which is also the state
that determines the action of the three controlled gates on the right) is
|00⟩ if none of the erroneous NOT gates on the left has acted. It is |10⟩
if only the upper erroneous NOT gate has acted, |11⟩ if only the
middle one has acted, and |01⟩ if only the lower one has acted.

There is an alternative way of representing the use of cNOT gates in
Figure 5.3 to diagnose the error, which is useful in correcting quantum
errors in more realistic cases. The alternative point of view is based
on the easily confirmed fact that the uncorrupted state (5.3) is left
unaltered by either of the operators Z_2 Z_1 and Z_1 Z_0, while the three
corruptions (5.4) are each eigenstates of Z_2 Z_1 and Z_1 Z_0 with distinct
sets of eigenvalues: 1 and −1 in the case of |Ψ_0⟩; −1 and −1 in the case
of |Ψ_1⟩; and −1 and 1 in the case of |Ψ_2⟩.

Table 5.1. Two operators that diagnose the error syndrome
for the 3-Qbit code that protects against bit-flip errors. The
four entries in each of the two rows indicate whether the
operator for that row commutes (+) or anticommutes (−) with
the operators at the top of the four columns.

               X_2    X_1    X_0     1
    Z_2 Z_1     −      −      +      +
    Z_1 Z_0     +      −      −      +

While these last three facts can be confirmed directly from the explicit
forms of |Ψ_0⟩, |Ψ_1⟩, and |Ψ_2⟩ on the right of (5.4), it is worth
noting, for purposes of comparison with some of the more complex
cases that follow, that they also follow from the facts that Z_2 Z_1 and
Z_1 Z_0 act as the identity on the uncorrupted state |Ψ⟩, that the corrupted
states are of the form |Ψ_j⟩ = X_j|Ψ⟩, and that X_j commutes
with Z_i when i ≠ j, while X_j anticommutes with Z_j: Z_j X_j = −X_j Z_j.
The resulting pattern of commutations (+) or anticommutations (−)
is summarized in Table 5.1.
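
The sign pattern of Table 5.1 can also be read off directly from the basis states appearing in (5.3) and (5.4). In the sketch below, which is an illustration with hypothetical function names, a product Z_i Z_k multiplies a computational-basis state by (−1) raised to the sum of bits i and k, with Qbits labeled 2, 1, 0 from the left.

```python
def zz_eigenvalue(bits, i, k):
    """Eigenvalue of Z_i Z_k on a 3-Qbit basis state (Qbit 2 is leftmost)."""
    return (-1) ** (int(bits[2 - i]) + int(bits[2 - k]))

def flip_qbit(bits, j):
    pos = 2 - j
    return bits[:pos] + ("1" if bits[pos] == "0" else "0") + bits[pos + 1:]

def syndrome_signs(j=None):
    """Joint eigenvalues of (Z_2 Z_1, Z_1 Z_0) on X_j(alpha|000> + beta|111>);
    j=None means the uncorrupted state."""
    terms = ["000", "111"]
    if j is not None:
        terms = [flip_qbit(t, j) for t in terms]
    signs = {(zz_eigenvalue(t, 2, 1), zz_eigenvalue(t, 1, 0)) for t in terms}
    assert len(signs) == 1      # both terms agree, so the state is an eigenstate
    return signs.pop()

for label, j in [("X_2", 2), ("X_1", 1), ("X_0", 0), ("1", None)]:
    print(label, syndrome_signs(j))
```

Each printed pair reproduces one column of Table 5.1, with +1 standing for "commutes" and −1 for "anticommutes".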

Thus the joint eigenvalues of the commuting operators Z_2 Z_1 and
Z_1 Z_0 distinguish among the uncorrupted state and each of the three
possible corruptions. A procedure that takes advantage of this by
sandwiching controlled Z_2 Z_1 and controlled Z_1 Z_0 gates between
Hadamards acting on the control Qbits is shown in Figure 5.5.
Although it takes a little thought to confirm directly from the figure
that Figure 5.5 does indeed accomplish error correction - we shall
work this out in Section 5.4 as a special case of a much more general
procedure - one can confirm that it does by simply noting that
Figure 5.5 is mathematically equivalent to Figure 5.3. This equivalence
follows from the facts that X = HZH, that H^2 = 1, and that the
action of controlled-Z is unaltered by exchanging the target and control
Qbits.
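
The three facts just cited are small matrix identities and can be checked numerically. This sketch uses plain Python lists as matrices in the basis |00⟩, |01⟩, |10⟩, |11⟩ (left Qbit listed first); the helper functions and variable names are assumptions made for the example.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Tensor (Kronecker) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]   # left Qbit controls
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

hzh_is_x = close(matmul(matmul(H, Z), H), X)                  # X = HZH
h_squared_is_1 = close(matmul(H, H), I2)                      # H^2 = 1
cz_is_symmetric = close(matmul(matmul(SWAP, CZ), SWAP), CZ)   # control <-> target
IH = kron(I2, H)                                              # Hadamard on the target only
cnot_from_cz = close(matmul(matmul(IH, CZ), IH), CNOT)        # cZ between Hadamards = cNOT

print(hzh_is_x, h_squared_is_1, cz_is_symmetric, cnot_from_cz)
```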

This oversimplified example, in which only bit-flip errors are allowed,
illustrates most of the features of quantum error correction that
one encounters in more realistic cases. The more general procedure
is complicated by the fact that, as noted above and made precise in
Section 5.3 below, the general error a Qbit can experience is more
complicated than just a single bit flip. As a result, one needs codewords
containing more than three Qbits to correct general single-Qbit errors,
and one requires more complicated diagnostic and corrective procedures
than those of Figures 5.3-5.5, involving more than just a pair
of ancillary Qbits. But although the codewords and error-correcting
circuits are more elaborate, once we have identified the more general
form of quantum errors there are no further conceptual complications
in understanding the kinds of procedures that can correct them.

Fig 5.5

An apparently unnecessary complication of the
error-correcting circuit in Figure 5.3, which transforms it into the
more general form described in Section 5.4. The circuit is equivalent
to that in Figure 5.3: (1) the cNOT gates in Figure 5.3 can be replaced
by controlled-Z gates if Hadamard gates act on each ancilla before and
after the controlled gates act; (2) each of the four controlled-Z gates
acts in the same way if its control and target Qbits are interchanged;
and (3) pairs of controlled gates with the same control Qbit and two
different targets can be combined into a single controlled gate with
that control Qbit and a 2-Qbit target operation that is just the product
of the two 1-Qbit target operations. The part of the circuit between
and including the pairs of Hadamards on the right and left is a simple
example of the more complex error-diagnosing circuits that appear in
Figures 5.8, 5.9, and N.2 (in Appendix N).

The more general form Qbit errors can assume is discussed in
Section 5.3. Somewhat surprisingly, it turns out that the general
1-Qbit error can be viewed as a simple extension of what we have just
described: in addition to the possibility of an extraneous X gate acting
on the Qbit, there might also be an extraneous Z gate or an extraneous
Y = ZX gate. If we can diagnose and correct for each of these
three possible corruptions, then we can correct for arbitrary 1-Qbit
errors.

Section 5.4 describes a generalization of the diagnostic scheme we 
have just exploited for extracting relational information about the Qbits 
that make up a codeword, by coupling groups of them to ancillary 
Qbits, which are then measured. It turns out that the operators needed 
to diagnose the error - generalizations of the operators Z_2 Z_1 and Z_1 Z_0
for the 3-Qbit code - are also useful for defining the more general 
codewords. 

In Section 5.5 we examine two of the most important n-Qbit codes 
with n > 3 that are able to correct general single-Qbit errors: the 5-Qbit
and 7-Qbit codes. The 5-Qbit code is the ideal code for general
1-Qbit errors in the same way that the 3-Qbit code is ideal for bit-flip 
errors. The 7-Qbit code is more likely to be of practical interest, for 
reasons we shall describe. The earliest quantum error-correcting code - 
the 9-Qbit code discovered by Shor - is now of only historical interest, 
and is relegated to Appendix N. 


5.3 The physics of error generation 

Errors are not, of course, produced by extra gates accidentally appearing
in a circuit, as in the oversimplified example of Section 5.2. They
are produced by extraneous interactions with the world external to the 
computer or with computationally irrelevant degrees of freedom of 
the computer itself. Although one would like the state of the Qbits to 
evolve only under the action of the unitary transformations imposed 
by the gates of the computer, inevitably Qbits will interact, even if only 
weakly, with other physical systems or degrees of freedom having nothing
to do with the computation in which the Qbits are participating.
In a well-designed computer such spurious interactions will be kept 
to a minimum, but their disruptive effects on the quantum state of 
the Qbits can grow continuously from zero, in contrast to disruptive 
effects on Cbits, which have to exceed a large threshold before a Cbit 
can change its state. In a quantum computer such spurious changes of 
state will eventually accumulate to the point where the calculation falls 
apart, unless ongoing efforts are made to eliminate them. 

To characterize the most general way in which a Qbit can be deflected 
from its computational task, we must finally acknowledge that Qbits are 
not the only things in the world that are described by quantum states. 
The quantum theory provides the most fundamental description we 
have of everything in the world, and it describes everything in the 
world - not just Qbits - by means of quantum states. 
This spectacular expansion of the scope of quantum states might not
come as a complete surprise to the nonphysicist reader. I have stressed 
all along that the quantum state of a Qbit or a collection of Qbits is not 
a property carried by those Qbits, but a way of concisely summarizing 
everything we know that has happened to them, to enable us to make 
statistical predictions about the information we might then be able to 
extract from them. If quantum states are not properties inherent in the 
system they describe, but states of the knowledge we have managed 
to acquire about the prior history of the system - if they somehow 
incorporate fundamental aspects of how we exchange information with 
the world outside of us - then they might indeed have an applicability 
going beyond the particular kinds of systems we have applied them to 
up until now. 

Indeed, nowhere in this exposition of quantum computation has it 
been necessary to refer to the individual character of the Qbits. Whether 
they are spinning electrons, polarized photons, atoms in cavities, or 
any number of other things, the quantum-mechanical description of 
their computational behavior has been exactly the same. So insofar 
as the assignment of quantum states to physical systems is a general 
feature of how we come to grips with the external world, it might 
not be unreasonable to assign a quantum state |e⟩ to whatever part
of the world comes into interactive contact with the Qbit or Qbits -
their environment. We will not make any specific assumptions about the
character of that environment or of the quantum state |e⟩ associated
with it, beyond noting that, unlike the state of a single Qbit, the state 
of the environment is likely to be a state in a space of enormously many 
dimensions if there is any complexity to the environment that couples, 
however weakly, to the Qbit. 

If, in spite of this recommended point of view, you still feel uncomfortable
applying quantum states to noncomputational degrees of
freedom, then I invite you to regard |e⟩ as the state of some enormous
collection of extra Qbits, from which one would like the computation
to be completely decoupled, but which, for reasons beyond our control,
somehow manage to interact weakly with the Qbits we are actually
interested in. I offer this invitation as a conceptual aid to computer
scientists uncomfortable with my claim that quantum states apply to
the description of arbitrary physical systems. But I also note that in recent
years a few physicists have suggested that the entire world should
indeed be viewed as an enormous collection of Qbits - a position that
has not attracted many adherents to date.

Returning from grand world views to the practical reality of errors 
in a quantum computation, we shall regard a single Qbit, initially in
the state $|x\rangle$ ($x$ = 0 or 1), as being part of a larger system consisting
of the Qbit plus its environment, initially in the state $|e\rangle|x\rangle$. In the
ideal case, as the Qbit evolves under 1-Qbit unitary gates or interacts
with other Qbits under 2-Qbit unitary gates, it stays unentangled with
its environment. The environmental component of the state is then
irrelevant to the computational process and can be ignored, as we have 
been doing up to now. 

Unfortunately, however, interactions with the environment will in
general transform and entangle the states of the Qbit and its environment.
The most general way in which this can come about can be
expressed in the form

$$
\begin{aligned}
|e\rangle|0\rangle &\rightarrow |e_0\rangle|0\rangle + |e_1\rangle|1\rangle, \\
|e\rangle|1\rangle &\rightarrow |e_2\rangle|0\rangle + |e_3\rangle|1\rangle,
\end{aligned}
\tag{5.7}
$$

where $|e\rangle$ is the initially uncorrelated state of the environment and
$|e_0\rangle, \ldots, |e_3\rangle$ are possible final environmental states. The environmental
final states are not necessarily orthogonal or normalized, and are
constrained only by the requirement that the two states on the right
side of (5.7) should be orthogonal, since the Qbit-environment interaction
is required, like any other physical interaction, to lead to a
unitary development in time. This corruption of a computation by the
entanglement of the state of Qbits with the state of their environment
is called decoherence. It is the primary enemy of quantum computation.

Included in (5.7) are cases like the oversimplified one we examined
in Section 5.2, in which the Qbit remains isolated from the environment
($|e_i\rangle = a_i|e\rangle$, $i = 0, \ldots, 3$) but still suffers in that isolation an
unintended unitary evolution. But (5.7) also includes the case of major
practical interest. This is the case in which the interaction with the
environment has a small but otherwise quite general entangling effect
on the Qbit:

$$ |e_0\rangle \approx |e_3\rangle \approx |e\rangle; \qquad \langle e_1|e_1\rangle,\ \langle e_2|e_2\rangle \ll 1. \tag{5.8} $$


In dealing with such entangling interactions with the environment,
it is useful to introduce projection operators

$$ P_x = \frac{1 + (-1)^x Z}{2}, \tag{5.9} $$

which project onto the 1-Qbit states $|x\rangle$, $x = 0, 1$. Using these
projection operators, we can combine the two time evolutions in (5.7)
into a single form:

$$ |e\rangle|x\rangle \rightarrow \bigl(\bigl[|e_0\rangle 1 + |e_1\rangle X\bigr]P_0\bigr)|x\rangle + \bigl(\bigl[|e_2\rangle X + |e_3\rangle 1\bigr]P_1\bigr)|x\rangle. \tag{5.10} $$

In (5.10) I have introduced the convenient notation $|e\rangle U$ to describe
the linear operator from a 1-Qbit to a many-Qbit space that takes
the 1-Qbit state $|\psi\rangle$ into the many-Qbit state $|e\rangle \otimes U|\psi\rangle$; like most
embellishments of Dirac notation it is defined so that the appropriate
form of the associative law holds:

$$ (|e\rangle U)|\psi\rangle = |e\rangle \otimes U|\psi\rangle. \tag{5.11} $$

Using the explicit form (5.9) of the two projection operators, defining
(see footnote 1)

$$ Y = ZX, \tag{5.12} $$

and continuing to use the notational convention (5.11), we can rewrite
(5.10) as

$$ |e\rangle|x\rangle \rightarrow \left( \frac{|e_0\rangle + |e_3\rangle}{2}\,1 + \frac{|e_0\rangle - |e_3\rangle}{2}\,Z + \frac{|e_2\rangle + |e_1\rangle}{2}\,X + \frac{|e_2\rangle - |e_1\rangle}{2}\,Y \right)|x\rangle. \tag{5.13} $$


There is nothing special about the particular environmental states
appearing in (5.13), so we can rewrite it more compactly in terms of four
other (in general neither normalized nor orthogonal) states $|a\rangle$, $|b\rangle$, $|c\rangle$,
and $|d\rangle$ of the environment as

$$ |e\rangle|x\rangle \rightarrow \bigl(|d\rangle 1 + |a\rangle X + |b\rangle Y + |c\rangle Z\bigr)|x\rangle. \tag{5.14} $$

The time development represented by the arrow in (5.14) is unitary and
therefore linear, so the combination of environmental states and unitary
operators on the right side of (5.14) acts linearly on $|x\rangle$. Therefore
(5.14) holds not only for $|e\rangle|0\rangle$ and $|e\rangle|1\rangle$ but also for any superposition
$\alpha|e\rangle|0\rangle + \beta|e\rangle|1\rangle = |e\rangle(\alpha|0\rangle + \beta|1\rangle) = |e\rangle|\psi\rangle$, in the form

$$ |e\rangle|\psi\rangle \rightarrow \bigl(|d\rangle 1 + |a\rangle X + |b\rangle Y + |c\rangle Z\bigr)|\psi\rangle. \tag{5.15} $$

The actions of X, Z, and Y on the uncorrupted state $|\psi\rangle$ are sometimes
described as subjecting the Qbit to a bit-flip error, a phase error,
and a combined bit-flip and phase error. Using this terminology, a general
environmental degradation of the state of a Qbit, which can always
be put in the form (5.15), can be viewed as a superposition of no error
(1), a bit-flip error (X), a combined bit-flip and phase error (Y), and
a phase error (Z). The oversimplified example of Section 5.2 ignored
the possibility of phase errors (Z) and combined errors (Y).
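The passage from the general interaction (5.7) to the four-term form can be checked numerically. The sketch below is only an illustration: it models the four final environmental states as arbitrary real three-component vectors (an assumption made purely for the example), forms the half-sums and half-differences that multiply 1, Z, X, and Y, and confirms that the resulting combination acts on $|0\rangle$ and $|1\rangle$ exactly as (5.7) requires. Y is taken to be ZX, so that XZ = -Y; all names in the code are hypothetical.

```python
# Final environmental states |e_0>, ..., |e_3> of (5.7), modeled as arbitrary
# real 3-component vectors (an illustrative assumption, not from the text).
e0, e1, e2, e3 = [1.0, 0.0, 0.0], [0.0, 0.125, 0.0], [0.0, 0.0, 0.25], [0.75, 0.25, 0.0]

def combine(u, v, sign):
    """Return (u + sign*v)/2 componentwise."""
    return [(p + sign * q) / 2 for p, q in zip(u, v)]

# Environmental states multiplying 1, Z, X, Y in the four-term form.
d = combine(e0, e3, +1)
c = combine(e0, e3, -1)
a = combine(e2, e1, +1)
b = combine(e2, e1, -1)

# Action of each operator on |x>: returns (sign, new bit).
ACTION = {
    "1": lambda x: (1, x),
    "X": lambda x: (1, 1 - x),
    "Z": lambda x: (-1 if x else 1, x),
    "Y": lambda x: (-1 if (1 - x) else 1, 1 - x),  # Y = ZX: flip first, then Z
}

def corrupted(x):
    """Environment vectors multiplying |0> and |1> after the interaction acts on |e>|x>."""
    out = {0: [0.0] * 3, 1: [0.0] * 3}
    for env, name in ((d, "1"), (a, "X"), (b, "Y"), (c, "Z")):
        sign, y = ACTION[name](x)
        out[y] = [p + sign * q for p, q in zip(out[y], env)]
    return out

from_zero = corrupted(0)  # should be {0: e0, 1: e1}
from_one = corrupted(1)   # should be {0: e2, 1: e3}
```

The choice of three components is arbitrary; any dimension for the environment works, since only linear combinations of the environmental vectors are involved.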

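If we were to extend this analysis to the corruption of an n-Qbit
codeword $|\Psi\rangle_n$, we would end up with a combined state of the codeword
and the environment of the form

$$ |e\rangle|\Psi\rangle_n \rightarrow \sum_{\mu_1=0}^{3} \cdots \sum_{\mu_n=0}^{3} |e_{\mu_1 \cdots \mu_n}\rangle\, X^{(\mu_1)} \otimes \cdots \otimes X^{(\mu_n)}\, |\Psi\rangle_n, \tag{5.16} $$

where

$$ X^{(0)} = 1, \quad X^{(1)} = X, \quad X^{(2)} = Y, \quad X^{(3)} = Z. \tag{5.17} $$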

The construction of error-correcting codewords requires a physical
assumption analogous to the assumption in Section 5.2 that at most a
single Qbit in a codeword suffers a flip error. If $|\Psi\rangle_n$ is a state of a small
number n of Qbits that make up an n-Qbit codeword, then the probability
of corruption of the codeword is so small that the terms in (5.16)
differing from the term $1 \otimes \cdots \otimes 1$ that reproduces the uncorrupted
state are dominated by those in which only a single one of the $X^{(\mu_i)}$
differs from 1. If this condition is met, then the general form (5.16) of
a corrupted n-Qbit codeword is a superposition of terms in which each
individual Qbit making up the word has suffered a degradation of the
form (5.15):

$$ |e\rangle|\Psi\rangle_n \rightarrow \Bigl( |d\rangle 1 + \sum_{i=0}^{n-1} \bigl( |a_i\rangle X_i + |b_i\rangle Y_i + |c_i\rangle Z_i \bigr) \Bigr) |\Psi\rangle_n. \tag{5.18} $$

1 This Y differs by a factor of i from the Y briefly used in Section 1.4.

One can allow for the more general possibility of two or more Qbits 
in a codeword being corrupted together if one is willing to use longer 
codewords to correct such errors. The examples of error correction 
given below are all at the level of single-Qbit errors of the form (5.18) 
in the codeword. The extent to which the dominant sources of error 
will actually be of this form may well depend on the kind of physical 
system used to realize the Qbits. Eventually the theory of quantum 
error correction will have to face this issue. Meanwhile this possible 
future source of difficulty should not distract you from appreciating 
how remarkable it is that an error-correction procedure exists at all, 
even in the restricted setting of single-Qbit errors. 

To correct 1-Qbit errors we require a procedure that restores a
corrupted state of the form

$$ |d\rangle|\Psi\rangle + \sum_{i=0}^{n-1} \bigl( |a_i\rangle X_i|\Psi\rangle + |b_i\rangle Y_i|\Psi\rangle + |c_i\rangle Z_i|\Psi\rangle \bigr) \tag{5.19} $$

to the uncorrupted form

$$ |e\rangle|\Psi\rangle, \tag{5.20} $$

where $|e\rangle$ is the environmental state accompanying whichever of the
3n + 1 terms in (5.19) our error-correction procedure has projected
the corrupted state onto. If the term in $X_i$ were the only one present
in (5.19), we could use a 3-Qbit codeword and achieve this projection
by applying precisely the error-correction technique described in
Section 5.2. But to deal with the additional possibilities associated with
the terms in $Y_i$ and $Z_i$ we require longer codewords and more elaborate
diagnostic methods.


5.4 Diagnosing error syndromes 

Before turning to specific quantum error-correcting codes, it is useful to
anticipate the general structure of the gates we will be using to identify
and project onto a particular term in the general 1-Qbit corruption
(5.19) of a codeword. As noted earlier, these will be generalizations
of the controlled $Z_2Z_1$ and $Z_1Z_0$ gates used to diagnose errors in the
artificial case in which only bit-flip errors are allowed.

Let A be any n-Qbit Hermitian operator whose square is the unit
operator:

$$ A^2 = 1. \tag{5.21} $$

It follows from (5.21) that A is unitary, since $A^\dagger = A$. The eigenvalues
of A can only be 1 or -1, since A acting twice on an eigenstate must
act as the identity 1. The projection operators onto the subspaces of
states with eigenvalue +1 and -1 are, respectively,

$$ P_0^A = \frac{1 + A}{2} \quad \text{and} \quad P_1^A = \frac{1 - A}{2}. \tag{5.22} $$

Since $P_0^A + P_1^A = 1$, any state $|\Psi\rangle$ can be expressed as a superposition
of its projections onto these two subspaces: $|\Psi\rangle = P_0^A|\Psi\rangle + P_1^A|\Psi\rangle$.

The operators $Z_2Z_1$ and $Z_1Z_0$ encountered in the 3-Qbit code for
correcting bit-flip errors are examples of such A. In the more general
cases we shall be examining, the operators A will be more general
products of both Z and X operators associated with different Qbits in
the codeword; for example $A = Z_4X_3Z_2X_1X_0$.

In addition to the n Qbits on which A acts, we introduce an ancillary
Qbit and consider the controlled operator $C^A$, which we write here
in the alternative form cA to avoid having subscripts on superscripts,
which acts as A on the n Qbits when the state of the ancilla is $|1\rangle$ and
as the identity when the state of the ancilla is $|0\rangle$. If the state of the
ancilla is a superposition of $|0\rangle$ and $|1\rangle$, the action of cA is defined by
linearity. When A is a product of 1-Qbit operators, the operator cA
can be taken to be a product of ordinary 2-Qbit controlled operators. If
$A = Z_4X_3Z_2X_1X_0$, then cA would be $cZ_4\,cX_3\,cZ_2\,cX_1\,cX_0$, where each
of the five terms has a different target Qbit, but all are controlled by
one and the same ancilla.

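If the ancilla is initially in the state $|0\rangle$ and one applies a Hadamard
transform H to the ancilla both before and after applying cA to the
n + 1 Qbits and the initial state of the n Qbits is $|\Psi\rangle$, then the n Qbits
will end up entangled with the ancilla in the state

$$
\begin{aligned}
(H \otimes 1)\,cA\,(H \otimes 1)|0\rangle|\Psi\rangle
&= (H \otimes 1)\,cA\,\tfrac{1}{\sqrt 2}\bigl(|0\rangle + |1\rangle\bigr)|\Psi\rangle \\
&= (H \otimes 1)\,\tfrac{1}{\sqrt 2}\bigl(|0\rangle|\Psi\rangle + |1\rangle A|\Psi\rangle\bigr) \\
&= \tfrac12\bigl(|0\rangle + |1\rangle\bigr)|\Psi\rangle + \tfrac12\bigl(|0\rangle - |1\rangle\bigr)A|\Psi\rangle \\
&= |0\rangle\,\tfrac12(1 + A)|\Psi\rangle + |1\rangle\,\tfrac12(1 - A)|\Psi\rangle \\
&= |0\rangle P_0^A|\Psi\rangle + |1\rangle P_1^A|\Psi\rangle.
\end{aligned}
\tag{5.23}
$$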

If we now measure the ancilla, then according to the generalized Born
rule, if the measurement gate indicates 0 or 1, then the state of the n
Qbits becomes the (renormalized) projection of $|\Psi\rangle$ into the subspace
of positive (eigenvalue +1) or negative (eigenvalue -1) eigenstates of
A. This is illustrated for the case $A = Z_4X_3Z_2X_1X_0$ in Figure 5.6.

This procedure is called measuring A or a measurement of A. The
terminology reflects the fact that it is a generalization of the ordinary
process of measuring a single Qbit, to which it reduces when n = 1 and
A = Z. In that case the subspaces spanned by the positive and negative
eigenstates of Z are just the one-dimensional subspaces spanned by $|0\rangle$
and $|1\rangle$, and the probabilities of the two outcomes, as one can easily
check, are indeed given by the Born rule.
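The Hadamard, controlled-A, Hadamard sequence can be simulated directly on a small example. The following sketch is a minimal illustration under stated assumptions: states are stored as dictionaries mapping (ancilla bit, data bits) to real amplitudes, A is restricted to a product of X and Z operators on distinct Qbits, and the example state $0.6|00\rangle + 0.8|10\rangle$ (tuple index k standing for Qbit k) together with all function names is invented for the demonstration. With $A = Z_1Z_0$, the ancilla outcome 0 should leave the data Qbits in the eigenvalue +1 component and outcome 1 in the eigenvalue -1 component.

```python
from math import sqrt

def apply_pauli(pauli, bits):
    """Apply a product of X's and Z's on distinct Qbits; return (sign, new bits)."""
    sign, new = 1, list(bits)
    for k, letter in pauli.items():
        if letter == "Z":
            sign *= -1 if bits[k] else 1
        else:  # "X"
            new[k] ^= 1
    return sign, tuple(new)

def hadamard_on_ancilla(state):
    """H on the ancilla: |a> -> (|0> + (-1)^a |1>)/sqrt(2)."""
    out = {}
    for (anc, bits), amp in state.items():
        for new_anc in (0, 1):
            sign = -1 if (anc and new_anc) else 1
            key = (new_anc, bits)
            out[key] = out.get(key, 0.0) + sign * amp / sqrt(2)
    return out

def controlled(pauli, state):
    """cA: apply A to the data Qbits only when the ancilla is 1."""
    out = {}
    for (anc, bits), amp in state.items():
        sign, new = apply_pauli(pauli, bits) if anc else (1, bits)
        out[(anc, new)] = out.get((anc, new), 0.0) + sign * amp
    return out

def measure_circuit(pauli, psi):
    """State of ancilla plus data after H, cA, H, with the ancilla starting in |0>."""
    state = {(0, bits): amp for bits, amp in psi.items()}
    return hadamard_on_ancilla(controlled(pauli, hadamard_on_ancilla(state)))

def outcome_probability(state, x):
    """Born-rule probability that measuring the ancilla gives x."""
    return sum(abs(amp) ** 2 for (anc, _), amp in state.items() if anc == x)

A = {0: "Z", 1: "Z"}                  # A = Z_1 Z_0
psi = {(0, 0): 0.6, (1, 0): 0.8}      # eigenvalue +1 part and eigenvalue -1 part
final = measure_circuit(A, psi)
p0, p1 = outcome_probability(final, 0), outcome_probability(final, 1)
```

Because the example state is already a sum of one eigenvalue +1 term and one eigenvalue -1 term, the two outcome probabilities are just the squared amplitudes of those terms.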

Fig 5.6. The way in which measurement gates are employed in quantum
error correction. The ancilla (upper wire) is initially in the state $|0\rangle$.
The remaining five Qbits are initially in the state $|\Psi\rangle$. If the measurement
gate acting on the ancilla gives the result x (0 or 1) then the final state of the
five Qbits will be the (renormalized) projection $P_x^A|\Psi\rangle$ of the initial state
into the subspace spanned by the eigenstates of $A = Z_4X_3Z_2X_1X_0$ with
eigenvalue $(-1)^x$.

In error correction one needs several such Hermitian operators,
each squaring to unity, all acting on the same n Qbits. For concreteness
consider the case of three such operators, A, B, and C. Introduce an
ancillary Qbit for each operator, labeling the ancillas 0, 1, and 2, and
introduce controlled operators cA, cB, and cC, each controlled by the
corresponding ancilla. Now apply Hadamards to each of the ancillas
(each initially taken to be in the state $|0\rangle$), both before and after the
product of all the controlled operators acts. The result (see Figure 5.7)
is the obvious generalization of (5.23), taking $|0\rangle|0\rangle|0\rangle|\Psi\rangle$ into

$$
\begin{aligned}
(H_2H_1H_0)&(cC\,cB\,cA)(H_2H_1H_0)|0\rangle|0\rangle|0\rangle|\Psi\rangle \\
&= \sum_{x_2=0}^{1}\sum_{x_1=0}^{1}\sum_{x_0=0}^{1} |x_2\rangle|x_1\rangle|x_0\rangle
\left(\frac{1 + (-1)^{x_2}C}{2}\right)\left(\frac{1 + (-1)^{x_1}B}{2}\right)\left(\frac{1 + (-1)^{x_0}A}{2}\right)|\Psi\rangle \\
&= \sum_{x_2=0}^{1}\sum_{x_1=0}^{1}\sum_{x_0=0}^{1} |x_2\rangle|x_1\rangle|x_0\rangle\, P_{x_2}^C P_{x_1}^B P_{x_0}^A |\Psi\rangle.
\end{aligned}
\tag{5.24}
$$

Fig 5.7. A, B, and C are commuting operators satisfying $A^2 = B^2 = C^2 = 1$.
They act on the n-Qbit state $|\Psi\rangle$ associated with the thick lower wire. The
effect of measuring the three ancillas (top three wires) is to project the state
of the n Qbits associated with the lower wire into its component in one of the
eight eigenspaces of A, B, and C. If the results of measuring the control bits
associated with A, B, and C are $x_0$, $x_1$, and $x_2$ then the projection is into the
eigenspace with eigenvalues $(-1)^{x_0}$, $(-1)^{x_1}$, and $(-1)^{x_2}$. Such a process
is called "measuring A, B, and C." When n = 3 and A, B, and C are three
different 1-Qbit Z operators, the process is equivalent to an ordinary
measurement of the three Qbits on which the three Z operators act.



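If A, B, and C all commute - which is always the case in the examples
relevant to error correction - then the state

$$ P_{x_2}^C P_{x_1}^B P_{x_0}^A |\Psi\rangle = \left(\frac{1 + (-1)^{x_2}C}{2}\right)\left(\frac{1 + (-1)^{x_1}B}{2}\right)\left(\frac{1 + (-1)^{x_0}A}{2}\right)|\Psi\rangle \tag{5.25} $$

is an eigenstate of all the operators C, B, and A, with respective
eigenvalues

$$ (-1)^{x_2},\ (-1)^{x_1},\ \text{and}\ (-1)^{x_0}. \tag{5.26} $$

This follows directly from the fact that if $V^2 = 1$ then

$$ V\left(\frac{1 + (-1)^x V}{2}\right) = (-1)^x\left(\frac{1 + (-1)^x V}{2}\right). \tag{5.27} $$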


So measurement of the three ancillas projects the n Qbits into one of
the eight simultaneous eigenspaces of the three commuting operators
C, B, and A, and the outcome $x_2x_1x_0$ of the measurement determines
which eigenspace it is. This process is described as a joint measurement
of C, B, and A.

Note that if A, B, and C are 1-Qbit operators $Z_i$, $Z_j$, and $Z_k$ that
act on the ith, jth, and kth of the n Qbits, then this process reduces to
the ordinary measurement of those three Qbits, since $\tfrac12(1 + (-1)^x Z)$
projects onto the 1-Qbit state $|x\rangle$. The two equivalent error-correction
circuits in Figures 5.3 and 5.5 are measurements, in this generalized
sense, of the two commuting operators $A = Z_2Z_1$ and $B = Z_1Z_0$.

The form (5.18) of a general 1-Qbit error on an n-Qbit codeword reveals
that to correct errors it is necessary to make a measurement, in this
more general sense of the term, that projects a possibly corrupted codeword
into an identifiable one of 1 + 3n orthogonal two-dimensional
spaces: one two-dimensional subspace for the uncorrupted codeword
$|\Psi\rangle$, and 3n additional two-dimensional subspaces for each of the 1-Qbit
error terms $X_i|\Psi\rangle$, $Y_i|\Psi\rangle$, and $Z_i|\Psi\rangle$, $i = 0, \ldots, n - 1$, in (5.18).
Thus the $2^n$-dimensional space spanned by all the states of the n Qbits
must be large enough to contain 1 + 3n orthogonal two-dimensional
subspaces, giving us the condition

$$ 2^{n-1} \geq 3n + 1 \tag{5.28} $$

on an n-Qbit code capable of correcting a general 1-Qbit error. The
lowest n satisfying this condition is n = 5, for which it holds as an
equality. Remarkably, there is indeed a 5-Qbit code for which this can
be done. This is reminiscent of the situation in Section 5.2, where it
was necessary only to discriminate between the uncorrupted codeword
$|\Psi\rangle$ and the n NOT-corruptions $X_i|\Psi\rangle$. There the number of Qbits
had to satisfy (5.5), which is first satisfied (again as an equality) when
n = 3.
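The counting argument is easy to reproduce. The sketch below searches for the smallest n for which the $2^n$-dimensional space has room for one two-dimensional subspace per error term plus one for the uncorrupted codeword. The function name and its parameter are hypothetical; the bit-flip-only case is modeled by assuming one error type per Qbit, which is the natural reading of the condition referred to as (5.5) but is not quoted from it.

```python
def smallest_codeword_length(error_types_per_qbit):
    """Smallest n with 2**n >= 2 * (1 + error_types_per_qbit * n).

    The right side counts the dimensions needed for 1 + k*n mutually
    orthogonal two-dimensional subspaces (k error types on each of n Qbits).
    """
    n = 1
    while 2 ** n < 2 * (1 + error_types_per_qbit * n):
        n += 1
    return n

general = smallest_codeword_length(3)        # X, Y, and Z errors on each Qbit
bit_flip_only = smallest_codeword_length(1)  # assumed model of the Section 5.2 case

# Check whether the bound is met with equality at the smallest n.
general_is_tight = 2 ** general == 2 * (1 + 3 * general)
bit_flip_is_tight = 2 ** bit_flip_only == 2 * (1 + bit_flip_only)
```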

The 5-Qbit code is the most compact and elegant of the quantum 
error-correcting codes, but it suffers from the fact that it is difficult 
to construct the appropriate generalizations of 1- and 2-Qbit gates 
between codewords. I therefore go on to describe a second, 7-Qbit code, 
which overcomes this problem. The first quantum error-correcting 
code, discovered by Peter Shor, which uses a 9-Qbit generalization of 
the 3-Qbit code of Section 5.2, is now of solely historical interest. It is 
described in Appendix N. 


5.5 The 5-Qbit error-correcting code 

The two 5-Qbit codewords $|\bar 0\rangle$ and $|\bar 1\rangle$ are most conveniently defined in
terms of the very operators, described in general terms in Section 5.4,
that will be used to diagnose the error syndrome. So we begin by
specifying those operators.

To distinguish 1 + (3 x 5) = 16 mutually orthogonal two-dimensional
subspaces we require four such mutually commuting
Hermitian operators that square to unity, since each can independently
have two eigenvalues ($\pm 1$) and $2^4 = 16$. These operators are defined
as follows:

$$
\begin{aligned}
M_0 &= Z_1X_2X_3Z_4, \\
M_1 &= Z_2X_3X_4Z_0, \\
M_2 &= Z_3X_4X_0Z_1, \\
M_3 &= Z_4X_0X_1Z_2.
\end{aligned}
\tag{5.29}
$$

Each of the $M_i$ squares to unity because each is a product of commuting
operators that square to unity. To check that the $M_i$ are mutually
commuting, note that all the individual $X_i$ and $Z_j$ operators commute
with one another except for an $X_i$ and $Z_i$ with the same index, which
anticommute: $X_iZ_i = -Z_iX_i$. But in converting the product of any
two different $M_i$ to the product in the reverse order by reversing the
orders of the individual $X_i$ and $Z_i$ operators that make them up, one
always encounters exactly two interchanges that result in a minus sign.
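The commutation argument amounts to counting, for each pair of operators, the Qbits on which one applies X and the other applies Z. A minimal sketch of that count follows, with each operator written as a dictionary from Qbit index to letter; the representation and the function names are illustrative choices, not notation from the text.

```python
from itertools import combinations

# Each syndrome operator maps a Qbit index to the letter acting on that Qbit.
M = [
    {1: "Z", 2: "X", 3: "X", 4: "Z"},  # M_0 = Z1 X2 X3 Z4
    {2: "Z", 3: "X", 4: "X", 0: "Z"},  # M_1 = Z2 X3 X4 Z0
    {3: "Z", 4: "X", 0: "X", 1: "Z"},  # M_2 = Z3 X4 X0 Z1
    {4: "Z", 0: "X", 1: "X", 2: "Z"},  # M_3 = Z4 X0 X1 Z2
]

def sign_flips(p, q):
    """Number of Qbits on which p and q apply different letters (X versus Z).

    Each such Qbit contributes one factor of -1 when the order of the
    product p*q is reversed.
    """
    return sum(1 for k in p if k in q and p[k] != q[k])

def commute(p, q):
    """Two products of X's and Z's commute when the number of sign flips is even."""
    return sign_flips(p, q) % 2 == 0

flip_counts = {(i, j): sign_flips(M[i], M[j]) for i, j in combinations(range(4), 2)}
all_commute = all(commute(M[i], M[j]) for i, j in combinations(range(4), 2))
```

Inspecting `flip_counts` shows the count for each of the six pairs individually, which is a slightly stronger statement than mere commutation.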

One might be tempted to break the irritating asymmetry of (5.29)
by adding to the list

$$ M_4 = Z_0X_1X_2Z_3, \tag{5.30} $$

but it is not independent of the other four. Every $X_i$ and $Z_i$ appears
exactly twice in the product of all five $M_i$, so the product must be either
1 or -1. One easily checks that

$$ M_0M_1M_2M_3M_4 = 1, \tag{5.31} $$

and therefore

$$ M_4 = M_0M_1M_2M_3. \tag{5.32} $$
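The sign of the five-fold product can be tracked mechanically. In the sketch below every 1-Qbit factor is written as $Z^aX^b$ and stored as the pair (a, b); multiplying two such factors on the same Qbit costs a factor of $(-1)^{bc}$ because an X must be moved past a Z. This bookkeeping, along with all names used, is an illustrative assumption rather than anything taken from the text.

```python
Z, X = (1, 0), (0, 1)  # (a, b) stands for Z^a X^b on one Qbit

def multiply(p, q):
    """Product p*q of two signed operator strings.

    A string is (sign, ops), where ops maps a Qbit index to (a, b).
    On each Qbit, (Z^a X^b)(Z^c X^d) = (-1)^(b*c) Z^(a+c) X^(b+d).
    """
    sign = p[0] * q[0]
    ops = dict(p[1])
    for k, (c, d) in q[1].items():
        a, b = ops.get(k, (0, 0))
        if b and c:  # an X moves past a Z on the same Qbit
            sign = -sign
        ops[k] = ((a + c) % 2, (b + d) % 2)
    ops = {k: v for k, v in ops.items() if v != (0, 0)}  # drop identity factors
    return sign, ops

def product(strings):
    """Ordered product of a list of signed operator strings."""
    result = (1, {})
    for s in strings:
        result = multiply(result, s)
    return result

M = [
    (1, {1: Z, 2: X, 3: X, 4: Z}),  # M_0
    (1, {2: Z, 3: X, 4: X, 0: Z}),  # M_1
    (1, {3: Z, 4: X, 0: X, 1: Z}),  # M_2
    (1, {4: Z, 0: X, 1: X, 2: Z}),  # M_3
    (1, {0: Z, 1: X, 2: X, 3: Z}),  # M_4
]

all_five = product(M)        # expected to be the identity with sign +1
first_four = product(M[:4])  # expected to coincide with M_4
```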

The 5-Qbit codewords are most clearly and usefully defined in terms
of the $M_i$ (rather than writing out their lengthy explicit expansions in
computational-basis states):

$$
\begin{aligned}
|\bar 0\rangle &= \tfrac14 (1 + M_0)(1 + M_1)(1 + M_2)(1 + M_3)|00000\rangle, \\
|\bar 1\rangle &= \tfrac14 (1 + M_0)(1 + M_1)(1 + M_2)(1 + M_3)|11111\rangle.
\end{aligned}
\tag{5.33}
$$

Before examining how one might produce five Qbits in either of these
states, we discuss how the states work to correct 1-Qbit errors.

Since each $M_i$ flips two Qbits, $|\bar 0\rangle$ is a superposition of computational-basis
states with an odd number of zeros (and an even number of ones),
while $|\bar 1\rangle$ is a superposition of states with an odd number of ones (and
an even number of zeros). Consequently the two codeword states are
orthogonal. They are also normalized to unity. Since $M_i^2 = 1$,

$$ (1 + M_i)^2 = 2(1 + M_i). \tag{5.34} $$

So we have

$$
\begin{aligned}
\langle \bar 0|\bar 0\rangle &= \langle 00000|(1 + M_0)(1 + M_1)(1 + M_2)(1 + M_3)|00000\rangle, \\
\langle \bar 1|\bar 1\rangle &= \langle 11111|(1 + M_0)(1 + M_1)(1 + M_2)(1 + M_3)|11111\rangle.
\end{aligned}
\tag{5.35}
$$

If we expand the products of $1 + M_i$ into 16 terms, the term 1 contributes
1 to $\langle \bar 0|\bar 0\rangle$ and to $\langle \bar 1|\bar 1\rangle$. Each of the remaining 15 terms can
be reduced, using (5.31) (and the fact that each $M_i^2 = 1$), to either a
single $M_i$ or a product of two ($i = 0, \ldots, 4$). So each of the 15 terms
flips either two or four Qbits and contributes 0 to the inner products.
Because the $M_i$ all commute and because

$$ M_i(1 + M_i) = 1 + M_i, \tag{5.36} $$

the states $|\bar 0\rangle$, $|\bar 1\rangle$, and their superpositions

$$ |\Psi\rangle = \alpha|\bar 0\rangle + \beta|\bar 1\rangle \tag{5.37} $$

are all eigenstates of each of the $M_i$ with eigenvalue 1.
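These properties of the codewords can be confirmed by brute force, since five Qbits have only 32 basis states. The sketch below builds both codewords as sparse dictionaries from bit tuples to amplitudes, applying the four factors $(1 + M_i)$ to $|00000\rangle$ and $|11111\rangle$ with an overall factor of 1/4, and then checks normalization, orthogonality, and the eigenvalue +1 property. The data layout (tuple index k standing for Qbit k) and the helper names are assumptions made for the illustration.

```python
M = [
    {1: "Z", 2: "X", 3: "X", 4: "Z"},  # M_0
    {2: "Z", 3: "X", 4: "X", 0: "Z"},  # M_1
    {3: "Z", 4: "X", 0: "X", 1: "Z"},  # M_2
    {4: "Z", 0: "X", 1: "X", 2: "Z"},  # M_3
]

def apply(pauli, state):
    """Apply a product of X's and Z's (on distinct Qbits) to a sparse state."""
    out = {}
    for bits, amp in state.items():
        new = list(bits)
        for k, letter in pauli.items():
            if letter == "Z":
                if bits[k]:
                    amp = -amp  # Z multiplies |1> by -1
            else:
                new[k] ^= 1     # X flips the bit
        out[tuple(new)] = out.get(tuple(new), 0.0) + amp
    return out

def codeword(start):
    """(1/4)(1 + M_0)(1 + M_1)(1 + M_2)(1 + M_3)|start>."""
    state = {start: 0.25}
    for m in reversed(M):  # the rightmost factor acts first
        shifted = apply(m, state)
        keys = set(state) | set(shifted)
        state = {b: state.get(b, 0.0) + shifted.get(b, 0.0) for b in keys}
    return state

def inner(u, v):
    """Inner product of two real sparse states."""
    return sum(amp * v.get(bits, 0.0) for bits, amp in u.items())

zero_bar = codeword((0, 0, 0, 0, 0))
one_bar = codeword((1, 1, 1, 1, 1))
```

Each codeword turns out to have 16 nonzero amplitudes of magnitude 1/4, which is another way of seeing why the states are normalized.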

The 15 possible corruptions of (5.37) appearing in the corrupted
state (5.18) are also eigenstates of the $M_i$, distinguished by the
$15 = 2^4 - 1$ other possible sets of eigenvalues $\pm 1$ that the four $M_i$
($i = 0, \ldots, 3$) can have. To see this, note first that each $X_i$, $Y_i$, and
$Z_i$ commutes or anticommutes with each of the four $M_j$. Therefore each of
the terms $X_i|\Psi\rangle$, $Y_i|\Psi\rangle$, and $Z_i|\Psi\rangle$ appearing in (5.18) is indeed an
eigenstate of each $M_j$ with eigenvalue 1 or -1.

Table 5.2. The four error-syndrome operators $M_i$ for the 5-Qbit
code, and whether each of them commutes (+) or anticommutes (-)
with each of the 15 operators $X_i$, $Y_i$, and $Z_i$, $i = 0, \ldots, 4$, associated
with the 15 different terms in the corrupted codeword. Note that
each of the 15 columns, and the 16th column associated with 1 (no
error), has a unique pattern of + and - signs.

                        X_0 Y_0 Z_0   X_1 Y_1 Z_1   X_2 Y_2 Z_2   X_3 Y_3 Z_3   X_4 Y_4 Z_4    1
M_0 = Z_1 X_2 X_3 Z_4    +   +   +     -   -   +     +   -   -     +   -   -     -   -   +     +
M_1 = Z_2 X_3 X_4 Z_0    -   -   +     +   +   +     -   -   +     +   -   -     +   -   -     +
M_2 = Z_3 X_4 X_0 Z_1    +   -   -     -   -   +     +   +   +     -   -   +     +   -   -     +
M_3 = Z_4 X_0 X_1 Z_2    +   -   -     +   -   -     -   -   +     +   +   +     -   -   +     +

Table 5.2 indicates whether each $M_i$ commutes (+) or anticommutes
(-) with each of the $X_i$, $Y_i$, $Z_i$, and (trivially) the unit operator 1.
Inspection of the table reveals that each of the 16 possible binary columns
of four symbols (+ or -) appears in exactly one column. Therefore,
when the four $M_i$ are measured, the corrupted state (5.18) is projected
back to its original form if all four eigenvalues are +1, or projected onto
one of the 15 corrupted states $X_0|\Psi\rangle, \ldots, Z_4|\Psi\rangle$ depending on which
column in the table describes the eigenvalues. In each corrupted case
the original state can be restored by application of the corresponding
unitary transformation $X_i$, $-Y_i = X_iZ_i$, or $Z_i$ to the appropriate Qbit.
A circuit that measures the four operators (5.29) is shown in
Figure 5.8.
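The content of Table 5.2 can be regenerated from the definitions of the four syndrome operators, and turned into a lookup from measured eigenvalues to the error that occurred. The following sketch does this; the dictionary representation of the operators and the names `syndrome` and `lookup` are illustrative assumptions.

```python
# The four syndrome operators of the 5-Qbit code, as Qbit index -> letter.
M = [
    {1: "Z", 2: "X", 3: "X", 4: "Z"},  # M_0
    {2: "Z", 3: "X", 4: "X", 0: "Z"},  # M_1
    {3: "Z", 4: "X", 0: "X", 1: "Z"},  # M_2
    {4: "Z", 0: "X", 1: "X", 2: "Z"},  # M_3
]

def syndrome(letter, qbit):
    """Eigenvalues of M_0, ..., M_3 on the corrupted state letter_qbit |Psi>.

    An M anticommutes with the error (eigenvalue -1) exactly when it acts on
    the same Qbit with a different letter; Y differs from both X and Z.
    """
    return tuple(-1 if (qbit in m and m[qbit] != letter) else 1 for m in M)

# Map every syndrome to the error it diagnoses; (1, 1, 1, 1) means no error.
lookup = {(1, 1, 1, 1): "1"}
for qbit in range(5):
    for letter in "XYZ":
        lookup[syndrome(letter, qbit)] = f"{letter}_{qbit}"

all_patterns_distinct = len(lookup) == 16
```

If two errors shared a syndrome, the later dictionary entry would overwrite the earlier one and `lookup` would have fewer than 16 entries, so the length check is a direct test of the uniqueness claimed for the table.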

The perfect efficiency of the 5-Qbit code leads to a straightforward
way to manufacture the two 5-Qbit codeword states (5.33).
As noted above, the 16 distinct sets of eigenvalues for the four
mutually commuting operators $M_i$ decompose the 32-dimensional
space of five Qbits into 16 mutually orthogonal two-dimensional subspaces,
spanned by $|\bar 0\rangle$ and $|\bar 1\rangle$ and by each of their 15 pairs of 1-Qbit
corruptions.

Fig 5.8. A circuit to measure the error syndrome for the 5-Qbit
code. The five Qbits are the five lower wires. The four upper wires are the
ancillas to be measured in the manner of Figure 5.7, associated with measuring
the four commuting operators $Z_1X_2X_3Z_4$, $Z_2X_3X_4Z_0$, $Z_3X_4X_0Z_1$,
and $Z_4X_0X_1Z_2$ of (5.29). When controlled-Z gates are present together with
controlled-NOT gates, the figure is more readable if the cNOT gates are
represented as controlled-X gates.

The two-fold degeneracy of the four $M_i$ within each of these 16
subspaces is lifted by the operator

$$ \bar Z = Z_0Z_1Z_2Z_3Z_4, \tag{5.38} $$

which commutes with all the $M_i$. Since $|00000\rangle$ and $|11111\rangle$ are eigenstates
of $\bar Z$ with eigenvalues 1 and -1, and since $\bar Z$ commutes with $Z_i$
while anticommuting with $X_i$ and $Y_i$, it follows that

$$
\begin{aligned}
\bar Z|\bar 0\rangle &= |\bar 0\rangle, & \bar Z|\bar 1\rangle &= -|\bar 1\rangle, \\
\bar Z Z_i|\bar 0\rangle &= Z_i|\bar 0\rangle, & \bar Z Z_i|\bar 1\rangle &= -Z_i|\bar 1\rangle, \\
\bar Z X_i|\bar 0\rangle &= -X_i|\bar 0\rangle, & \bar Z X_i|\bar 1\rangle &= X_i|\bar 1\rangle, \\
\bar Z Y_i|\bar 0\rangle &= -Y_i|\bar 0\rangle, & \bar Z Y_i|\bar 1\rangle &= Y_i|\bar 1\rangle.
\end{aligned}
\tag{5.39}
$$


Consequently if one takes five Qbits in any state you like (perhaps
most conveniently $|00000\rangle$) and measures the four $M_i$ together with $\bar Z$,
one projects the Qbits into one of the 32 states

$$ |\bar 0\rangle,\ X_i|\bar 0\rangle,\ Y_i|\bar 0\rangle,\ Z_i|\bar 0\rangle,\ |\bar 1\rangle,\ X_i|\bar 1\rangle,\ Y_i|\bar 1\rangle,\ Z_i|\bar 1\rangle, \tag{5.40} $$

and learns from the results of the measurement which it is. Just as in the
error-correction procedure, if the state is not $|\bar 0\rangle$ or $|\bar 1\rangle$ we can restore
it to either of these forms by applying the appropriate $X_i$, $Y_i$, or $Z_i$. If
we wish to initialize the five Qbits to $|\bar 0\rangle$ we can apply $\bar X$, where

$$ \bar X = X_0X_1X_2X_3X_4, \tag{5.41} $$

should the measurement indicate that the error-corrected state is $|\bar 1\rangle$.
This process of using a generalized measurement to produce five Qbits
in the state $|\bar 0\rangle$ is analogous to the procedure of using an ordinary
measurement to produce a single Qbit in the state $|0\rangle$ described in
Section 1.10.

There is quite a different way to construct the 5-Qbit codewords, by
applying a set of 1- and 2-Qbit unitary gates to an uncoded 1-Qbit state
and four ancillary Qbits all initially in the state $|0\rangle$. This is described
in Section 5.9.

Fig 5.9. A circuit to measure the error syndrome for the 7-Qbit
code. The seven Qbits are the seven lower wires. The six upper wires are the
ancillas to be measured, resulting in a measurement of the six commuting
operators $Z_0Z_4Z_5Z_6$, $Z_1Z_3Z_5Z_6$, $Z_2Z_3Z_4Z_6$, $X_0X_4X_5X_6$, $X_1X_3X_5X_6$,
and $X_2X_3X_4X_6$ of (5.42).


5.6 The 7-Qbit error-correcting code 

The 5-Qbit code is theoretically ideal but suffers from the problem
that circuits performing extensions of many of the basic 1- and 2-Qbit
operations to the 5-Qbit codewords are cumbersome. The current
favorite is a 7-Qbit code, devised by Andrew Steane, which permits
implementations of many basic operations on codewords that are not
only quite simple but also themselves susceptible to error correction.

The Steane code uses six mutually commuting operators to diagnose
the error syndrome:

$$
\begin{aligned}
M_0 &= X_0X_4X_5X_6, & N_0 &= Z_0Z_4Z_5Z_6, \\
M_1 &= X_1X_3X_5X_6, & N_1 &= Z_1Z_3Z_5Z_6, \\
M_2 &= X_2X_3X_4X_6, & N_2 &= Z_2Z_3Z_4Z_6.
\end{aligned}
\tag{5.42}
$$

The six operators in (5.42) clearly square to give the unit operator. The
$M_i$ trivially commute among themselves as do the $N_i$, and each $M_i$
commutes with each $N_j$, in spite of the anticommutation of each $X_k$
with the corresponding $Z_k$, because in every case they share an even
number of such pairs. A circuit that measures the six operators (5.42)
is shown in Figure 5.9.

The 7-Qbit codewords are defined by

$$
\begin{aligned}
|\bar 0\rangle &= 2^{-3/2}(1 + M_0)(1 + M_1)(1 + M_2)|0\rangle_7, \\
|\bar 1\rangle &= 2^{-3/2}(1 + M_0)(1 + M_1)(1 + M_2)\bar X|0\rangle_7,
\end{aligned}
\tag{5.43}
$$

where

$$ \bar X = X_0X_1X_2X_3X_4X_5X_6, \tag{5.44} $$

so that

$$ |1111111\rangle = \bar X|0000000\rangle. \tag{5.45} $$

We again defer our discussion of how to produce these states until after 
our discussion of how they are used in error correction. 

The two states in (5.43) are orthogonal, since each $M_i$ flips four
Qbits while $\bar X$ flips all seven of them, so the first state is a superposition
of 7-Qbit states with an odd number of zeros while the second is a
superposition with an even number of zeros. They are normalized to
unity, for essentially the same reasons as in the case of the 5-Qbit code.

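Since $\bar X$ commutes with all the $M_i$, a general superposition of the
two codewords can be written as

$$ |\Psi\rangle = \alpha|\bar 0\rangle + \beta|\bar 1\rangle = (\alpha 1 + \beta\bar X)|\bar 0\rangle \tag{5.46} $$

and its corruption (5.18) assumes the form

$$ |e\rangle|\Psi\rangle \rightarrow \Bigl( |d\rangle 1 + \sum_{i=0}^{6} \bigl[ |a_i\rangle X_i + |b_i\rangle Y_i + |c_i\rangle Z_i \bigr] \Bigr)|\Psi\rangle. \tag{5.47} $$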


Because the $M_i$ all commute and $M_i(1 + M_i) = 1 + M_i$, and because
the $N_i$ commute with the $M_i$ and with $\bar X$ and have $|0000000\rangle$
as an eigenstate with eigenvalue 1, it follows that $|\bar 0\rangle$, $|\bar 1\rangle$, and the general
superposition (5.46) are eigenstates of each of the $M_i$ and $N_i$ with
eigenvalue 1. The 21 possible corruptions of (5.46) appearing in (5.47)
are also eigenstates, distinguished by the possible sets of eigenvalues
$\pm 1$ that the three $M_i$ and three $N_i$ can have. As in the 5-Qbit case, this
is because each $X_j$, $Y_j$, and $Z_j$ commutes or anticommutes with each of
the $M_i$ and $N_i$, so each state appearing in (5.47) is indeed an eigenstate
of each $M_i$ and $N_i$ with eigenvalue 1 or -1.

To see why the results of the six measurements of the $M_i$ and $N_i$
determine a unique one of the 22 terms in (5.47), examine Table 5.3,
which indicates by a bullet (•) whether an $X_j$ appears in each of the $M_i$
and whether a $Z_j$ appears in each of the $N_i$. Each $M_i$ commutes with
every $X_j$; it anticommutes with $Y_j$ and $Z_j$ if a bullet appears in the
column associated with $X_j$ and commutes if there is no bullet; each $N_i$
commutes with every $Z_j$; it anticommutes with $X_j$ and $Y_j$ if a bullet
appears in the column associated with $Z_j$ and commutes if there is no
bullet.

Table 5.3. The six error-syndrome operators $M_i$ and $N_i$, i = 0, 1, 2,
for the 7-Qbit code. A bullet (•) indicates whether a given $X_j$ appears
in each $M_i$ and whether a given $Z_j$ appears in each $N_i$.

        X_0  X_1  X_2  X_3  X_4  X_5  X_6
M_0      •                   •    •    •
M_1           •         •         •    •
M_2                •    •    •         •

        Z_0  Z_1  Z_2  Z_3  Z_4  Z_5  Z_6
N_0      •                   •    •    •
N_1           •         •         •    •
N_2                •    •    •         •

The signature of an $X_i$ error (or no error) is that all three $M_i$ measurements
give +1. The pattern of -1 eigenvalues in the $N_i$ measurements
then determines which of the seven possible $X_i$ characterize the error.
(If all three $N_i$ measurements also give +1 there is no error.)

In the same way, the signature of a $Z_i$ error (or no error) is that all
three $N_i$ measurements give +1 and then the pattern of -1 eigenvalues
in the $M_i$ measurements determines which of the seven possible $Z_i$
characterize the error.

Finally, the signature of a $Y_i$ error is that at least some of both the
$M_i$ and the $N_i$ measurements give -1. The resulting pattern of -1
eigenvalues (which will be the same for both the $M_i$ and the $N_i$ measurements)
then determines which of the seven possible $Y_i$ characterize
the error.

So the six measurements project the corrupted state into a unique
one of the 22 terms in (5.47) and establish which term it is. One can
then undo the corruption by applying the appropriate one of the 22
operators $1, X_0, \ldots, Z_6$.
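The three decoding rules translate directly into a small lookup procedure. The sketch below is an illustration only: it records which Qbits appear in each of the three X-type operators (the three Z-type operators act on the same sets), represents a measurement outcome as a pair of patterns in which True marks an eigenvalue of -1, and names its functions `column`, `syndrome`, and `decode`, none of which come from the text.

```python
# Qbits appearing in M_0, M_1, M_2; the operators N_0, N_1, N_2 act on the same sets.
SUPPORT = [{0, 4, 5, 6}, {1, 3, 5, 6}, {2, 3, 4, 6}]
NONE = (False, False, False)

def column(qbit):
    """Which of the three operators contain this Qbit (one column of bullets)."""
    return tuple(qbit in s for s in SUPPORT)

def syndrome(letter, qbit):
    """(M pattern, N pattern) for a single error; True marks an eigenvalue -1.

    The X-type M operators anticommute with Z and Y errors on their Qbits;
    the Z-type N operators anticommute with X and Y errors on their Qbits.
    """
    m_pattern = column(qbit) if letter in ("Y", "Z") else NONE
    n_pattern = column(qbit) if letter in ("X", "Y") else NONE
    return m_pattern, n_pattern

def decode(m_pattern, n_pattern):
    """Return the single-Qbit error matching the two patterns, or None."""
    qbit_of = {column(q): q for q in range(7)}
    if m_pattern == NONE and n_pattern == NONE:
        return "1"
    if m_pattern == NONE:
        return f"X_{qbit_of[n_pattern]}"
    if n_pattern == NONE:
        return f"Z_{qbit_of[m_pattern]}"
    if m_pattern == n_pattern:
        return f"Y_{qbit_of[m_pattern]}"
    return None  # differing nonzero patterns: not a single-Qbit error

round_trip_ok = all(
    decode(*syndrome(letter, q)) == f"{letter}_{q}"
    for letter in "XYZ" for q in range(7)
)
```

The decoder works because the seven columns are exactly the seven distinct nonzero patterns of three bits, so a nonzero pattern always identifies one Qbit.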

To produce the 7-Qbit codewords one cannot immediately extend
the method we used in Section 5.5 to produce the 5-Qbit codewords,
because the two 7-Qbit codewords and their 21 1-Qbit corruptions
constitute only 44 mutually orthogonal states, while the space of seven
Qbits has dimension $2^7 = 128$. One can, however, provide the missing
84 dimensions by noting the following.

The 2 x 7 x 6 = 84 states given by

$$ X_iZ_j|\bar 0\rangle \quad \text{and} \quad X_iZ_j|\bar 1\rangle, \qquad i \neq j, \tag{5.48} $$

are also easily verified to be eigenstates of all the $M_i$ and $N_i$. These
states can be associated with some of the possible 2-Qbit errors, but
this is not pertinent to the use to which we put them here. Like the
1-Qbit $Y_i$ errors, these states result in at least some of both the $M_i$ and
the $N_i$ measurements giving -1, but unlike the $Y_i$ errors, the resulting
pattern of -1 eigenvalues will not be the same for both the $M_i$ and the
$N_i$ measurements, since $i \neq j$. Each of the 7 x 6 = 42 possibilities for
$X_iZ_j$ leads to its own characteristic pattern of +1 and -1 eigenvalues.
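The bookkeeping behind this count can be checked by enumerating every pattern of the six eigenvalues. The sketch below is an illustration under the same assumed representation as a pair (M pattern, N pattern), with True marking an eigenvalue of -1; it collects the patterns for no error, the 21 single-Qbit errors, and the 42 products with differing indices, and confirms that together they exhaust all $2^6 = 64$ possibilities without repetition. The variable names are hypothetical.

```python
# Qbits appearing in M_0, M_1, M_2 (and equally in N_0, N_1, N_2).
SUPPORT = [{0, 4, 5, 6}, {1, 3, 5, 6}, {2, 3, 4, 6}]
NONE = (False, False, False)

def column(qbit):
    """Which of the three operators of one type contain this Qbit."""
    return tuple(qbit in s for s in SUPPORT)

# Key: (M pattern, N pattern). A Z on Qbit j flips the M pattern by column(j);
# an X on Qbit i flips the N pattern by column(i).
patterns = {(NONE, NONE): "1"}
for i in range(7):
    patterns[(NONE, column(i))] = f"X_{i}"
    patterns[(column(i), NONE)] = f"Z_{i}"
    patterns[(column(i), column(i))] = f"Y_{i}"
    for j in range(7):
        if i != j:
            patterns[(column(j), column(i))] = f"X_{i} Z_{j}"

total_patterns = len(patterns)
two_qbit_patterns = sum(1 for label in patterns.values() if " " in label)
```

Doubling the 64 patterns to account for the two codewords gives the 128 dimensions of the seven-Qbit space.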

This gets us back to the situation we encountered in the 5-Qbit case.
By measuring the seven mutually commuting operators $M_i$, $N_i$, and

$$ \bar Z = Z_0Z_1Z_2Z_3Z_4Z_5Z_6, \tag{5.49} $$

we can produce from seven Qbits in an arbitrarily chosen state a unique
one of the 128 mutually orthogonal states given by $|\bar 0\rangle$, $|\bar 1\rangle$, their 42
different 1-Qbit corruptions, and their 84 different special kinds of
2-Qbit corruptions. The results of the measurement tell us the character
(if any) of the corruption, from which we know what operators
($X_i$, $Y_i$, $Z_i$, or $X_iZ_j$, possibly combined with $\bar X$) we must apply to the
post-measurement state to convert it into $|\bar 0\rangle$.

A simpler way to produce 7-Qbit codewords is to start with seven
Qbits in the standard initial state $|0\rangle_7$, and then measure $M_0$, $M_1$, and
$M_2$. The resulting state will be one of the eight states

$$ 2^{-3/2}(1 \pm M_0)(1 \pm M_1)(1 \pm M_2)|0\rangle_7, \tag{5.50} $$

with the specific pattern of + and - signs being revealed by the measurement.
The upper part of Table 5.3 now permits one to choose a
unique $Z_i$ that commutes or anticommutes with each $M_j$ depending on
whether it appears in (5.50) with a + or a - sign. (If all three signs are +,
the state is already the codeword and nothing need be applied.)
Since $Z_i|0\rangle_7 = |0\rangle_7$,
acting on the seven Qbits with that particular $Z_i$ converts their state to

$$ 2^{-3/2}(1 + M_0)(1 + M_1)(1 + M_2)|0\rangle_7 = |\bar 0\rangle. \tag{5.51} $$

In Section 5.8 we examine a surprisingly simple circuit that encodes
a general 1-Qbit state into a 7-Qbit codeword state in the manner of
Figure 5.1, without using any measurement gates.


5.7 Operations on 7-Qbit codewords 

The virtue of the 7-Qbit code, that makes it preferable to the 5-Qbit
code in spite of its greater expenditure of Qbits, is that many of the
fundamental 1- and 2-Qbit gates are trivially extended to 7- and 14-Qbit
gates acting on the codewords. Because, for example, $\bar X$ commutes
with the $M_i$ and flips all seven Qbits, it implements the logical NOT
on the codewords (5.43):

$$ \bar X|\bar 0\rangle = |\bar 1\rangle, \qquad \bar X|\bar 1\rangle = |\bar 0\rangle. \tag{5.52} $$

Similarly, $\bar Z$ commutes with the $M_i$, anticommutes with $\bar X$, and leaves
$|0\rangle_7$ invariant, so it implements the logical Z on the codewords:

$$ \bar Z|\bar 0\rangle = |\bar 0\rangle, \qquad \bar Z|\bar 1\rangle = -|\bar 1\rangle. \tag{5.53} $$

This much works equally well for the 5-Qbit code. More remarkably,
for the 7-Qbit code the bitwise Hadamard transformation,

$$ \bar H = H_0H_1H_2H_3H_4H_5H_6, \tag{5.54} $$

also implements the logical Hadamard transformation on the codewords:

$$ \bar H|\bar 0\rangle = \tfrac{1}{\sqrt 2}\bigl(|\bar 0\rangle + |\bar 1\rangle\bigr), \qquad \bar H|\bar 1\rangle = \tfrac{1}{\sqrt 2}\bigl(|\bar 0\rangle - |\bar 1\rangle\bigr). \tag{5.55} $$

(This does not hold for the 5-Qbit code.)

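To see this, note first that two normalized states $|\phi\rangle$ and $|\psi\rangle$ are
identical if and only if their inner product is 1. (For one can always
express $|\phi\rangle$ in the form $|\phi\rangle = \alpha|\psi\rangle + \beta|\chi\rangle$, where $|\chi\rangle$ is orthogonal
to $|\psi\rangle$ and $|\alpha|^2 + |\beta|^2 = 1$. We then have $\langle\psi|\phi\rangle = \alpha$, so if $\langle\psi|\phi\rangle = 1$,
then $\alpha = 1$ and $\beta = 0$.) Since $|\overline{0}\rangle$ and $|\overline{1}\rangle$ are normalized and orthogonal
and since $\overline{H}$ is unitary and therefore preserves the normalization of $|\overline{0}\rangle$
and $|\overline{1}\rangle$, the four states appearing in the two equalities in (5.55) are all
normalized. Therefore, to establish those equalities it suffices to show
that

$$1 = \tfrac{1}{\sqrt{2}}\bigl(\langle\overline{0}|\overline{H}|\overline{0}\rangle + \langle\overline{0}|\overline{H}|\overline{1}\rangle\bigr), \qquad 1 = \tfrac{1}{\sqrt{2}}\bigl(\langle\overline{1}|\overline{H}|\overline{0}\rangle - \langle\overline{1}|\overline{H}|\overline{1}\rangle\bigr). \tag{5.56}$$

This in turn would follow if we could show that the matrix of the
encoded Hadamard in the encoded states is the same as the matrix of
the 1-Qbit Hadamard in the 1-Qbit states:

$$\langle\overline{0}|\overline{H}|\overline{0}\rangle = \langle\overline{0}|\overline{H}|\overline{1}\rangle = \langle\overline{1}|\overline{H}|\overline{0}\rangle = \tfrac{1}{\sqrt{2}}, \qquad \langle\overline{1}|\overline{H}|\overline{1}\rangle = -\tfrac{1}{\sqrt{2}}. \tag{5.57}$$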

To establish (5.57), note that it follows from the definition (5.43) of
the codewords $|\overline{0}\rangle$ and $|\overline{1}\rangle$ that the four matrix elements appearing in
(5.57) are

$$\langle\overline{x}|\overline{H}|\overline{y}\rangle = 2^{-3}\,{}_7\langle 0|\overline{X}^x(1 + M_0)(1 + M_1)(1 + M_2)\,\overline{H}\,(1 + M_0)(1 + M_1)(1 + M_2)\overline{X}^y|0\rangle_7. \tag{5.58}$$

Since $HX = ZH$ and $XH = HZ$, and since each $N_i$ differs from $M_i$
only by the replacement of each $X$ by the corresponding $Z$, it follows
that

$$\overline{H}M_i = N_i\overline{H}, \qquad M_i\overline{H} = \overline{H}N_i. \tag{5.59}$$

So we can bring all three terms $1 + M_i$ in (5.58) on the right of $\overline{H}$ over
to the left if we replace each by $1 + N_i$. But since the $M$s and $N$s all
commute we can then bring all three terms $1 + M_i$ on the left of $\overline{H}$
over to the right if we again replace each by $1 + N_i$. The effect of these
interchanges is simply to change all the $M$s in (5.58) into $N$s:

$$\langle\overline{x}|\overline{H}|\overline{y}\rangle = 2^{-3}\,{}_7\langle 0|\overline{X}^x(1 + N_0)(1 + N_1)(1 + N_2)\,\overline{H}\,(1 + N_0)(1 + N_1)(1 + N_2)\overline{X}^y|0\rangle_7. \tag{5.60}$$

Since each $N_i$ commutes with $\overline{X}$ (there are four anticommutations)
we have

$$\langle\overline{x}|\overline{H}|\overline{y}\rangle = 2^{-3}\,{}_7\langle 0|(1 + N_0)(1 + N_1)(1 + N_2)\,\overline{X}^x\overline{H}\,\overline{X}^y\,(1 + N_0)(1 + N_1)(1 + N_2)|0\rangle_7, \tag{5.61}$$

but since each $N_i$ acts as the identity on $|0\rangle_7$, each of the six $1 + N_i$ can
be replaced by a factor of 2, reducing (5.61) simply to

$$\langle\overline{x}|\overline{H}|\overline{y}\rangle = 2^{3}\,{}_7\langle 0|\overline{X}^x\overline{H}\,\overline{X}^y|0\rangle_7. \tag{5.62}$$

Since $\overline{X}$, $\overline{H}$, and $|0\rangle_7$ are tensor products of the seven 1-Qbit quantities
$X$, $H$, and $|0\rangle$, (5.62) is just

$$\langle\overline{x}|\overline{H}|\overline{y}\rangle = 2^{3}\,\langle x|H|y\rangle^7. \tag{5.63}$$

But since

$$\langle 0|H|0\rangle = \langle 0|H|1\rangle = \langle 1|H|0\rangle = \tfrac{1}{\sqrt{2}}, \qquad \langle 1|H|1\rangle = -\tfrac{1}{\sqrt{2}}, \tag{5.64}$$

(5.63) does indeed reduce to (5.57), establishing that $\overline{H} = H^{\otimes 7}$ does
indeed act as a logical Hadamard gate on the codewords.
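The result (5.57) can also be confirmed numerically. The following is a minimal NumPy sketch under stated assumptions: the codewords are built as $|\overline{0}\rangle = 2^{-3/2}(1 + M_0)(1 + M_1)(1 + M_2)|0\rangle_7$ and $|\overline{1}\rangle = \overline{X}|\overline{0}\rangle$, with the $M_i$ supports taken from Section 5.8, and Qbit $k$ is stored as bit $k$ of the basis-state index. The names `x_product`, `code0`, `code1`, and `logical` are illustrative only.

```python
from functools import reduce
import numpy as np

N_QBITS = 7
DIM = 2 ** N_QBITS
INDEX = np.arange(DIM)
M_SUPPORTS = [(0, 4, 5, 6), (1, 3, 5, 6), (2, 3, 4, 6)]  # M0, M1, M2 (assumed)

def x_product(state, qbits):
    """Apply X to every Qbit in qbits; this only permutes the amplitudes."""
    mask = sum(1 << q for q in qbits)
    return state[INDEX ^ mask]

# |0bar> = 2^(-3/2) (1 + M0)(1 + M1)(1 + M2) |0>_7
code0 = np.zeros(DIM)
code0[0] = 1.0
for support in M_SUPPORTS:
    code0 = code0 + x_product(code0, support)
code0 = code0 / 2 ** 1.5
code1 = x_product(code0, range(N_QBITS))       # |1bar> = Xbar |0bar>

H1 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # 1-Qbit Hadamard
H_BAR = reduce(np.kron, [H1] * N_QBITS)        # bitwise Hadamard on 7 Qbits

# Matrix of the bitwise Hadamard in the codeword basis, as in (5.57).
logical = np.array([[code0 @ H_BAR @ code0, code0 @ H_BAR @ code1],
                    [code1 @ H_BAR @ code0, code1 @ H_BAR @ code1]])
```

With these definitions `logical` agrees with `H1` to rounding error, which is the content of (5.57).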

Nor is it difficult to make a 14-Qbit logical cNOT gate that takes the
pair of codewords $|\overline{x}\rangle|\overline{y}\rangle$ into $|\overline{x}\rangle|\overline{x \oplus y}\rangle$. One simply applies ordinary
cNOT gates to each of the seven pairs of corresponding Qbits in the
two codewords. This works because each of the codewords in (5.43) is
left invariant by each of the $M_i$. If the control codeword is in the state
$|\overline{0}\rangle$ then the pattern of flips applied to the target codeword for each of
the eight terms in the expansion of the control codeword

$$|\overline{0}\rangle = 2^{-3/2}(1 + M_0 + M_1 + M_2 + M_1M_2 + M_2M_0 + M_0M_1 + M_0M_1M_2)|0\rangle_7 \tag{5.65}$$

is simply given by the corresponding product of $M_i$. Since each $M_i$ acts
as the identity on both $|\overline{0}\rangle$ and $|\overline{1}\rangle$, the target codeword is unchanged.
On the other hand, if the control codeword is in the state $|\overline{1}\rangle$ then
the pattern of flips applied to the target codeword differs from this
by an additional application of $\overline{X}$, which has precisely the effect of
interchanging $|\overline{0}\rangle$ and $|\overline{1}\rangle$.
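The argument above can be tested directly on 14-Qbit state vectors. This is a sketch under the same assumptions about the $M_i$ supports (taken from Section 5.8); the functions `codeword` and `transversal_cnot` are hypothetical names. The control word occupies the high seven bits of the index and the target word the low seven bits, so seven pairwise cNOTs send the basis state $|c\rangle|t\rangle$ to $|c\rangle|t \oplus c\rangle$.

```python
import numpy as np

DIM = 2 ** 7
INDEX = np.arange(DIM)
M_SUPPORTS = [(0, 4, 5, 6), (1, 3, 5, 6), (2, 3, 4, 6)]  # M0, M1, M2 (assumed)

def codeword(x):
    """Return the 7-Qbit codeword |xbar> as a length-128 vector."""
    state = np.zeros(DIM)
    state[0] = 1.0
    for support in M_SUPPORTS:
        mask = sum(1 << q for q in support)
        state = state + state[INDEX ^ mask]   # multiply by (1 + M_i)
    state = state / 2 ** 1.5
    if x:
        state = state[INDEX ^ (DIM - 1)]      # apply Xbar (flip all seven Qbits)
    return state

def transversal_cnot(pair_state):
    """cNOT from Qbit k of the control word onto Qbit k of the target word, k = 0..6."""
    grid = pair_state.reshape(DIM, DIM)       # grid[c, t] is the amplitude of |c>|t>
    c = INDEX[:, None]
    t = INDEX[None, :]
    # The new amplitude at (c, t) is the old amplitude at (c, t XOR c).
    return grid[c, t ^ c].reshape(-1)

results = {
    (x, y): np.allclose(
        transversal_cnot(np.kron(codeword(x), codeword(y))),
        np.kron(codeword(x), codeword(x ^ y)),
    )
    for x in (0, 1)
    for y in (0, 1)
}
```

Each entry of `results` should come out true, one for each of the four codeword pairs.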

Because of the simplicity of all these encoded gates, one can use
error correction to eliminate malfunctions of the elementary gates
themselves, if the rate of malfunctioning is so low that only a single
one of the seven elementary gates is likely to malfunction. In the
case of the 1-Qbit encoded gates, their elementary components act only
on single Qbits in the codeword, so if only a single one of them
malfunctions then only a single Qbit in the codeword will be corrupted and
the error-correction procedure described above will restore the correct
output. But this works as well for the encoded cNOT gate, since if
only a single one of the elementary 2-Qbit cNOT gates malfunctions,
this will affect only single Qbits in each of the two encoded 7-Qbit
words, and the correct output will again be restored by applying error
correction to both of the codewords.

Fig 5.10

A 7-Qbit encoding circuit that takes $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$
into the corresponding superposition of the two 7-Qbit codewords
given in (5.43), $|\Psi\rangle = \alpha|\overline{0}\rangle + \beta|\overline{1}\rangle$. The numbering of the Qbits from
6 to 0 is made explicit to facilitate comparison with the form
(5.42)-(5.44) of the codewords.

Another virtue of codeword gates that can be constructed as tensor
products of uncoded gates is that they cannot (when functioning
correctly) convert single-Qbit errors to multiple-Qbit errors, as more
elaborate constructions of codeword gates might do. This highly
desirable property is called fault tolerance. The great advantage of the
7-Qbit code is that many of the most important logical gates can be
implemented in a fault-tolerant way.


5.8 A 7-Qbit encoding circuit 

The circuit in Figure 5.10 encodes a general 1-Qbit state into a 7-Qbit 
codeword without using any measurement gates, in a manner analogous 
to the way Figure 5.1 produces 3-Qbit codewords. 

Since the circuit is unitary and therefore linear, it is enough to show
that it works when $|\psi\rangle = |0\rangle$ and when $|\psi\rangle = |1\rangle$. This follows from
the fact that if the $(n + 1)$-Qbit gate $C_U$ is a controlled $n$-Qbit unitary
$U$, then

$$C_U\bigl(H|0\rangle \otimes |\Phi\rangle_n\bigr) = C_U\,\tfrac{1}{\sqrt{2}}\bigl(|0\rangle + |1\rangle\bigr) \otimes |\Phi\rangle_n = \tfrac{1}{\sqrt{2}}(1 + X \otimes U)\,|0\rangle \otimes |\Phi\rangle_n, \tag{5.66}$$

where the control Qbit is on the left. If this is applied to the
three controlled triple-NOT gates in Figure 5.10 then, reading from
left to right, the resulting operations are
$(1/\sqrt{2})(1 + X_2X_3X_4X_6) = (1/\sqrt{2})(1 + M_2)$,
$(1/\sqrt{2})(1 + X_1X_3X_5X_6) = (1/\sqrt{2})(1 + M_1)$, and
$(1/\sqrt{2})(1 + X_0X_4X_5X_6) = (1/\sqrt{2})(1 + M_0)$.

When $|\psi\rangle = |0\rangle$ the controlled double-NOT on the left acts as the
identity, so the circuit does indeed produce the codeword $|\overline{0}\rangle$ in (5.43).
When $|\psi\rangle = |1\rangle$, the controlled double-NOT on the left acts as $X_4X_5$.
The circuit after that action is exactly the same as when $|\psi\rangle = |0\rangle$,
except that the initial state of Qbits 3, 4, and 5 on the left is $|1\rangle$ rather
than $|0\rangle$. Since all $X_i$ commute, the state that results is not $|\overline{0}\rangle$ but
$X_3X_4X_5|\overline{0}\rangle$. But

$$X_3X_4X_5 = X_0X_1X_2X_3X_4X_5X_6\,M_0M_1M_2 = \overline{X}M_0M_1M_2. \tag{5.67}$$

Since $M_0M_1M_2$ acts as the identity on $|\overline{0}\rangle$, the resulting state is indeed
$|\overline{1}\rangle = \overline{X}|\overline{0}\rangle$.
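A small state-vector simulation can make this argument concrete. Figure 5.10 itself is not reproduced here, so the gate layout below is an assumption reconstructed from the description above: $|\psi\rangle$ enters on Qbit 3, a controlled double-NOT from Qbit 3 targets Qbits 4 and 5, Hadamards act on Qbits 2, 1, and 0, and those three Qbits then control the triple-NOTs on Qbits (3, 4, 6), (3, 5, 6), and (4, 5, 6). All function names are illustrative.

```python
import numpy as np

DIM = 2 ** 7                       # Qbit k is bit k of the basis-state index
INDEX = np.arange(DIM)
M_SUPPORTS = [(0, 4, 5, 6), (1, 3, 5, 6), (2, 3, 4, 6)]  # M0, M1, M2

def mask(qbits):
    return sum(1 << q for q in qbits)

def x_product(state, qbits):
    """Apply X to every Qbit in qbits."""
    return state[INDEX ^ mask(qbits)]

def controlled_x(state, control, targets):
    """Flip the target Qbits on basis states whose control bit is 1."""
    control_set = ((INDEX >> control) & 1) == 1
    return state[np.where(control_set, INDEX ^ mask(targets), INDEX)]

def hadamard(state, k):
    """Apply a Hadamard to Qbit k."""
    sign = np.where(((INDEX >> k) & 1) == 0, 1.0, -1.0)
    return (sign * state + state[INDEX ^ (1 << k)]) / np.sqrt(2)

def encode(alpha, beta):
    """Run the assumed layout of Figure 5.10 on alpha|0> + beta|1>."""
    state = np.zeros(DIM, dtype=complex)
    state[0] = alpha               # |psi> sits on Qbit 3, all other Qbits in |0>
    state[1 << 3] = beta
    state = controlled_x(state, 3, (4, 5))
    for k in (2, 1, 0):
        state = hadamard(state, k)
    state = controlled_x(state, 2, (3, 4, 6))
    state = controlled_x(state, 1, (3, 5, 6))
    state = controlled_x(state, 0, (4, 5, 6))
    return state

# Reference codewords built directly from the M_i.
code0 = np.zeros(DIM)
code0[0] = 1.0
for support in M_SUPPORTS:
    code0 = code0 + x_product(code0, support)
code0 = code0 / 2 ** 1.5
code1 = x_product(code0, range(7))

# Identity (5.67) as an XOR of bit masks: X3 X4 X5 = Xbar M0 M1 M2.
identity_5_67 = mask((3, 4, 5)) == (
    mask(range(7)) ^ mask(M_SUPPORTS[0]) ^ mask(M_SUPPORTS[1]) ^ mask(M_SUPPORTS[2])
)
```

For example, `encode(0.6, 0.8j)` should coincide with `0.6 * code0 + 0.8j * code1`, illustrating the linearity invoked at the start of the argument.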

A less direct method to confirm that Figure 5.10 produces the 7-Qbit
encoding, analogous to the method described in Section 5.9 for
the 5-Qbit encoding, is given in Appendix O.


5.9 A 5-Qbit encoding circuit 

The circuit in Figure 5.11 encodes a general 1-Qbit state into a 5-Qbit 
codeword without using any measurement gates. 

The circuit differs from one reported by David DiVincenzo$^2$ only
by the presence of the 1-Qbit gates ZHZ on the left. When $|\psi\rangle = |x\rangle$
DiVincenzo’s circuit produces two orthogonal linear combinations of
the codewords (5.43), which are, of course, equally valid choices. But
to get the codewords in (5.43) one needs these additional gates. (I have
written them in the symmetric form ZHZ rather than in the simpler
equivalent form YH both to spare the reader from having to remember
that Y = ZX and not XZ, and also to spare her the confusion of having
to reverse the order of gates when going from a circuit diagram to the
corresponding equation.)

In contrast to the superficially similar circuit for the 7-Qbit code in 
Figure 5.10, there does not seem to be a transparently simple way to 
demonstrate that the circuit in Figure 5.11 does produce the 5-Qbit 
codewords. One can always, of course, write down the action of each 
successive gate in the circuit, and check that the resulting unwieldy 
expressions are identical to the explicit expansions of the codewords 
(5.33) in computational-basis states. A less clumsy proof follows from
the fact that $|\overline{0}\rangle$ is the unique (to within an overall phase factor $e^{i\varphi}$)
joint eigenvector with all eigenvalues 1 of the five mutually commuting
operators consisting of the four error-syndrome operators $M_0, \ldots, M_3$
of Equation (5.29) and the operator $\overline{Z}$ of Equation (5.38). So if we can
establish that the state $|\overline{x}\rangle$ produced in Figure 5.11 is invariant under
the four $M_i$, that it is invariant under $\overline{Z}$ when $x = 0$, and that applying
the 5-Qbit $\overline{X}$ to $|\overline{x}\rangle$ is the same as applying the 1-Qbit $X$ to $|x\rangle$, then we
will have shown that the circuit produces the 5-Qbit encoding to within
an overall phase factor $e^{i\varphi}$. Having done this, we can then confirm that
$e^{i\varphi} = 1$ by evaluating the projection on $|0\rangle_5$ of the state produced by
the circuit when $x = 0$.

2 David P. DiVincenzo, “Quantum Gates and Circuits,” Proceedings of the
Royal Society of London A 454, 261-276 (1998),
http://arxiv.org/abs/quant-ph/9705009.

Fig 5.11

A 5-Qbit encoding circuit. If the initial state of the Qbit on
the top wire is $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, then the circuit produces the
corresponding superposition of the two 5-Qbit codewords given in
(5.33), $|\Psi\rangle = \alpha|\overline{0}\rangle + \beta|\overline{1}\rangle$. This fact is established in Figures
5.12-5.20. The figure illustrates this for the states $|0\rangle$ and $|1\rangle$ ($x = 0$ or
1) on the upper wire. Since a product of unitary gates is linear, the
circuit encodes arbitrary superpositions of these states.

To learn the actions of various products of 1-Qbit Xs and Zs on the 
state produced by the circuit in Figure 5.11, we apply them on the right 
side of the diagram, and then bring them to the left through the cNOT 
gates and 1-Qbit gates that make up the circuit, until they act directly 
on the input state on the left. In doing this we must use the fact that 
bringing an X (or a Z) through a Hadamard converts it to a Z (or an 
X), bringing an X through a Z introduces a factor of $-1$, and bringing
an X or a Z through a cNOT has the results shown in Figure 5.12: a 
Z on the control Qbit (or an X on the target Qbit) commutes with a 
cNOT, while bringing a Z through the target Qbit (or an X through 
the control Qbit) introduces an additional Z on the control Qbit (or X 
on the target Qbit). 
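These rules are statements about small matrices and are easy to confirm numerically. The sketch below assumes a two-Qbit ordering with the control as the left tensor factor and the target as the right one; the dictionary keys are informal labels, not notation from the text.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# cNOT with the control on the left factor and the target on the right factor.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])
Xc, Xt = np.kron(X, I2), np.kron(I2, X)   # X on control, X on target
Zc, Zt = np.kron(Z, I2), np.kron(I2, Z)   # Z on control, Z on target

identities = {
    "X through H becomes Z": np.allclose(H @ X, Z @ H),
    "Z through H becomes X": np.allclose(H @ Z, X @ H),
    "X through Z gives a factor -1": np.allclose(X @ Z, -(Z @ X)),
    "X on control adds X on target": np.allclose(CNOT @ Xc, Xc @ Xt @ CNOT),
    "X on target commutes": np.allclose(CNOT @ Xt, Xt @ CNOT),
    "Z on target adds Z on control": np.allclose(CNOT @ Zt, Zc @ Zt @ CNOT),
    "Z on control commutes": np.allclose(CNOT @ Zc, Zc @ CNOT),
}
```

Every value in `identities` should be true; these are the only facts needed to push products of Xs and Zs from the output of the circuit back to its input.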

Figure 5.13 uses these elementary facts to show that $M_0 = Z_1X_2X_3Z_4$
leaves both codewords invariant, by demonstrating that it
can be brought to the left through all the gates in the circuit to act on
the input state $|x0000\rangle$ as $Z_2$. Figures 5.14-5.16 show similar things
for $M_1 = Z_2X_3X_4Z_0$, $M_2 = Z_3X_4X_0Z_1$, and $M_3 = Z_4X_0X_1Z_2$, which
can be brought to the left through all the gates to act on the input state
as $Z_0$, $Z_3$, and $Z_1$. Figure 5.17 shows that $\overline{X} = X_0X_1X_2X_3X_4$ can be
brought to the left through all the gates of the circuit to act on the
input state $|x0000\rangle$ as $X_4Z_2Z_1$, which simply interchanges $x = 0$ and
$x = 1$, thereby demonstrating that $\overline{X}$ acts as logical $X$ on the codewords.
Figure 5.18 shows the analogous property for $\overline{Z} = Z_0Z_1Z_2Z_3Z_4$, which
can be brought to the left through all the gates of the circuit to act on the
input state $|x0000\rangle$ as $Z_4Z_3Z_0$, which multiplies it by $(-1)^x$, thereby
demonstrating that $\overline{Z}$ acts as logical $Z$ on the codewords. Finally
Figures 5.19 and 5.20 show that the inner product of the codeword state $|\overline{0}\rangle$
with the computational-basis state $|00000\rangle$ is $\tfrac{1}{4}$, thereby demonstrating
that the circuit produces the codewords (5.33) with the right phase.

In Appendix O this circuit-theoretic approach is used to give a second
(more complicated, but instructive) demonstration of the validity
of the 7-Qbit encoding circuit of Figure 5.10.

Fig 5.12

Easily verifiable identities useful in determining how various
products of Xs and Zs act on the circuit of Figure 5.11. (a) A cNOT
can be interchanged with an X acting on the control Qbit, if another X
acting on the target Qbit is introduced. (b) A cNOT commutes with an
X acting on the target Qbit. (c) A cNOT can be interchanged with a Z
acting on the target Qbit, if another Z acting on the control Qbit is
introduced. (d) A cNOT commutes with a Z acting on the control
Qbit.
Fig 5.13

Demonstration that $M_0 = Z_1X_2X_3Z_4$ acting on the output
of the encoding circuit in Figure 5.11 is the same as $Z_2$ acting on the
input, which leaves the input invariant. On the extreme left $M_0$ is
applied to the output of the circuit. The insets (a)-(g) show what
happens as the X and Z gates making up $M_0$ are moved to the left
through the gates of the circuit. (a) $Z_4$ and $X_3$ are changed to $X_4$ and
$Z_3$ as a result of having been brought through Hadamard gates. (b)
Bringing the two X gates through the control Qbits of cNOT gates
produces a pair of cancelling X gates on the common target Qbit, so
the set of gates in (a) is unchanged when it is moved to (b). (c) The
Hadamard gates convert $X_4$ and $Z_1$ to $Z_4$ and $X_1$. (d) Bringing $X_2$
through the control Qbit of the cNOT produces an X on its target
Qbit which cancels the X already there. (e) The Hadamard on Qbit 2
converts the X to a Z. (f) Moving the $Z_2$ through the targets of the two
cNOTs produces Z gates on their control Qbits which cancel the two
Z gates already there. (g) The resulting $Z_2$ can be moved all the way to
the left.

Fig 5.14

Using the identities in Figure 5.12 and the fact that bringing
a Z through a Hadamard converts it to an X and vice versa establishes
that $M_1$ can be brought to the left through the gates of the encoding
circuit to act directly on $|x0000\rangle$ as $Z_0$.

Fig 5.15

$M_2$ can be brought to the left through the gates of the
encoding circuit to act directly on $|x0000\rangle$ as $Z_3$.

Fig 5.16

$M_3$ can be brought to the left through the gates of the
encoding circuit to act directly on $|x0000\rangle$ as $Z_1$.
Fig 5.17

Demonstration that $\overline{X} = X_0X_1X_2X_3X_4$ acting on the output
of the encoding circuit in Figure 5.11 is the same as $X_4Z_2Z_1$ acting on
the input, which interchanges $|00000\rangle$ and $|10000\rangle$. (a) Bringing $X_4$
and $X_3$ through the Hadamards converts them to $Z_4$ and $Z_3$. (b)
Bringing $X_2$ through the cNOT controlled by Qbit 2 produces an X on
the target Qbit 0, which cancels the X already there. (c) The
Hadamards convert $Z_4$ and $X_1$ to $X_4$ and $Z_1$. (d) Bringing $X_4$ and $X_2$ to
the left produces two $X_1$ gates which cancel; bringing $Z_1$ to the left
then produces additional $Z_4$ and $Z_2$ gates. (e) The Hadamard $H_2$
interchanges the $X_2$ and $Z_2$ gates. (f) First bring to the left the $Z_2$ gate,
then the $X_4$ gate. (g) The $H_4$ converts ZXZ to XZX = $-$Z. (h) No
further changes. (i) Z commutes with itself, is changed to X on passing
through H, and acquires another minus sign on passage through Z.

Fig 5.18

Demonstration that $\overline{Z} = Z_0Z_1Z_2Z_3Z_4$ acting on the output
of the encoding circuit in Figure 5.11 is the same as $Z_4Z_3Z_0$ acting on
the input, which takes $|x0000\rangle$ into $(-1)^x|x0000\rangle$.
Fig 5.19

A circuit-theoretic way to evaluate inner products. (a) A
circuit taking the input $|\Phi\rangle$ into the output $|\Psi\rangle = BA|\Phi\rangle$. The inner
product $\langle\Psi|\Omega\rangle$ of the output state $\Psi$ with some other state $|\Omega\rangle$ is given
by $\langle\Phi|A^\dagger B^\dagger|\Omega\rangle$. The diagram on the right in (b) shows this inner
product being evaluated by first letting $B^\dagger$ act on $|\Omega\rangle$, then letting $A^\dagger$
act on the result, and then taking the inner product with the input state
$|\Phi\rangle$. Evidently this generalizes to the product of many gates. If the
gates are all Hermitian, as they are in the circuit of Figure 5.11, then
the circuit on the right of (b) is identical to the circuit on the left of (a).
The resulting evaluation of the inner product of $|\Omega\rangle = |0\rangle_5$ with the
state $|\overline{0}\rangle$ produced by letting the circuit of Figure 5.11 act on
$|\Phi\rangle = |0\rangle_5$ is carried out in Figure 5.20.
Fig 5.20

Demonstration that the state produced by the encoding
circuit in Figure 5.11 when $x = 0$ has an inner product with the state
$|0\rangle_5$ that is $\tfrac{1}{4}$, thereby establishing that the phase factor $e^{i\varphi} = 1$, i.e.
that the state is precisely $|\overline{0}\rangle$ without any additional phase factor. (a)
Circuit-theoretic representation of the inner product, following the
procedure developed in Figure 5.19; all gates now act to the right. (b)
Elimination of operations in (a) that act as the identity: the cNOT on
the extreme right of (a) can be dropped since its control Qbit is in the
state $|0\rangle$; since $H|0\rangle$ is invariant under X, the pair of cNOT gates
targeting Qbit 1 can be dropped, as can the pair targeting Qbit 2. (c) A
pair of Hadamards on Qbit 4 in (b) cancel; a Hadamard on Qbit 3 in (b)
is moved to the left converting a cNOT to a controlled-Z; Qbits 2 and
1 in (b) simply give the matrix element $\langle 0|H|0\rangle = \tfrac{1}{\sqrt{2}}$, resulting in an
overall factor of $\tfrac{1}{2}$. (d) Expanding both states $H|0\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ on
the right of (c), the effect of the two cNOT gates in (c) is that only the
terms in $|0\rangle|0\rangle$ and $|1\rangle|1\rangle$ give nonzero contributions. (e) The action of
the controlled-Z gates in (d) has been carried out, leaving a sum of
products of matrix elements of H.

Chapter 6 


Protocols that use just a few Qbits 


6.1 Bell states 

In this chapter we examine some elementary quantum information-theoretic
protocols which are often encountered in the context of quantum
computation, though they also have applications in the broader
area of quantum information processing. Because they use only a small
number of Qbits, they have all been carried out in at least one laboratory,
unlike any but the most trivial and atypical examples of the protocols
we have considered in earlier chapters.

Most of these examples make use of the 2-Qbit entangled state,

$$|\psi_{00}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|00\rangle + |11\rangle\bigr). \tag{6.1}$$

This state can be assigned to two Qbits, each in the state $|0\rangle$, by applying
a Hadamard to one of them, and then using it as the control Qbit for a
cNOT that targets the other (Figure 6.1(a)):

$$|\psi_{00}\rangle = C_{10}H_1|00\rangle. \tag{6.2}$$

We generalize (6.2) by letting the original pair of unentangled Qbits
be in any of the four 2-Qbit computational-basis states $|00\rangle$, $|01\rangle$, $|10\rangle$,
and $|11\rangle$ (Figure 6.1(b)):

$$|\psi_{xy}\rangle = C_{10}H_1|xy\rangle. \tag{6.3}$$

Since the four states $|xy\rangle$ are an orthonormal set and the Hadamard
and cNOT gates are unitary, the four entangled states $|\psi_{xy}\rangle$ are also an
orthonormal set, called the Bell basis to honor the memory of the
physicist John S. Bell, who discovered in 1964 one of the most extraordinary
facts about 2-Qbit entangled states. We examine a powerful 3-Qbit
version of Bell’s theorem in Section 6.6.

If we rewrite (6.3) as

$$|\psi_{xy}\rangle = C_{10}H_1X_1^xX_0^y|00\rangle, \tag{6.4}$$

and recall that HX = ZH and that either a Z on the control Qbit or an
X on the target Qbit commutes with a cNOT, then we have

$$|\psi_{xy}\rangle = Z_1^xX_0^y\,C_{10}H_1|00\rangle = Z_1^xX_0^y\,\tfrac{1}{\sqrt{2}}\bigl(|00\rangle + |11\rangle\bigr), \tag{6.5}$$

as illustrated in Figure 6.2. This shows that the other Bell states are
obtained from $(1/\sqrt{2})(|00\rangle + |11\rangle)$ by flipping one of the Qbits, by
changing the $+$ to a $-$, or by doing both. This, of course, can also be
derived directly from (6.3) by letting the Hadamard and cNOT act for
each of the four choices for the pair xy.

Fig 6.1

(a) A circuit that creates the entangled state
$|\psi_{00}\rangle = \tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ from the unentangled computational-basis
state $|00\rangle$. (b) A circuit that creates the four orthonormal entangled
Bell states $|\psi_{xy}\rangle$ from the unentangled computational-basis state $|xy\rangle$.

Fig 6.2

The Bell states $|\psi_{xy}\rangle$ can be constructed from
$|\psi_{00}\rangle = \tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ by flipping a single Qbit, changing the sign
from $+$ to $-$, or doing both of these.
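Both routes to the Bell basis can be compared in a few lines of NumPy. This sketch assumes that the basis state $|xy\rangle$ has index $2x + y$, so that Qbit 1 (the control, carrying $x$) is the left tensor factor and Qbit 0 (the target, carrying $y$) is the right one. The function names are illustrative.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

H1 = np.kron(H, I2)                # Hadamard on Qbit 1 (left factor)
Z1 = np.kron(Z, I2)                # Z on Qbit 1
X0 = np.kron(I2, X)                # X on Qbit 0 (right factor)
C10 = np.array([[1, 0, 0, 0],      # cNOT: control Qbit 1, target Qbit 0
                [0, 1, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]])

def bell_from_circuit(x, y):
    """Equation (6.3): apply H to Qbit 1, then the cNOT, to |xy>."""
    ket = np.zeros(4)
    ket[2 * x + y] = 1.0
    return C10 @ H1 @ ket

def bell_from_psi00(x, y):
    """Equation (6.5): apply Z_1^x X_0^y to (|00> + |11>)/sqrt(2)."""
    psi00 = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
    return (np.linalg.matrix_power(Z1, x)
            @ np.linalg.matrix_power(X0, y) @ psi00)

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
bell_basis = np.array([bell_from_circuit(x, y) for x, y in pairs])
gram = bell_basis @ bell_basis.T   # matrix of inner products between Bell states
```

Here `gram` should be the 4 x 4 identity (orthonormality), and `bell_from_circuit(x, y)` should equal `bell_from_psi00(x, y)` for each pair.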

We now examine a few simple protocols in which some or all of 
the Bell states (or, in Section 6.6, their 3-Qbit generalizations) play an 
important role. 


6.2 Quantum cryptography 

A decade before Shor’s discovery that quantum computation posed a 
threat to the security of RSA encryption, it was pointed out that Qbits 
(though the term did not exist at the time) offered a quite different and 
demonstrably secure basis for the exchange of secret messages. 

Of all the various possible applications of quantum mechanics to
information processing, quantum cryptography arguably holds the most
promise for becoming a practical technology. There are several reasons
for this. First of all, it works Qbit by Qbit. The only relevant gates are
a small number of simple 1-Qbit gates. Interactions between pairs of
Qbits like those mediated by cNOT gates play no role, at least in the
most straightforward versions of the protocol.

Furthermore, in actual realizations of quantum cryptography the
physical Qbits are extremely simple. Each Qbit is a single photon
of light. The state of the Qbit is the linear polarization state of the
photon. If the states $|0\rangle$ and $|1\rangle$ describe photons with vertical and
horizontal polarization, then the states $H|0\rangle = (1/\sqrt{2})(|0\rangle + |1\rangle)$ and
$H|1\rangle = (1/\sqrt{2})(|0\rangle - |1\rangle)$ describe photons diagonally polarized,
either at 45° or at $-$45° to the vertical. Photons in any of these four
polarization states can be prepared in any number of ways, most simply
(if not most efficiently) by sending a weak beam of light through an
appropriately oriented polaroid filter. Once a photon has been prepared
in its initial polarization state it does not have to be manipulated any
further beyond eventually measuring either its horizontal-vertical or
its diagonal polarization by, for example, sending it through an
appropriately oriented birefringent crystal and seeing which beam it emerges
in, or seeing whether it does or does not get through another
appropriately oriented polaroid filter. Photons can effectively be shielded from
extraneous interactions by sending them through optical fibers, where
they can travel in a polarization-preserving manner at the speed of light.

This procedure can be viewed as the simplest possible quantum 
computation. First the Qbit is assigned an initial state by sending it 
through a 1-Qbit measurement gate. Then a 1-Qbit unitary gate is 
or is not applied (depending on whether a subsequent polarization 
measurement is to be along the same direction as the first). And finally 
the Qbit is sent through a second 1-Qbit measurement gate. 

The usefulness of easily transportable single Qbits for secret
communication stems from one important cryptographic fact: Alice and
Bob can have an unbreakable code if they share newly created identical
strings of random bits, called one-time codepads. If they both have
such identical random strings, then Alice can take her message, in the
form of a long string of zeros and ones, and transform it into its bitwise
modulo-2 sum (also called the exclusive or or XOR) with a random string
of zeros and ones of the same length taken from her one-time codepad.
Flipping or not flipping each bit of a coherent message according to
whether the corresponding bit of a random string is 0 or 1 converts
the message into another random string. (If this is not obvious, think
of the process as flipping or not flipping each bit of the random string,
according to whether the corresponding bit of the coherent message is
0 or 1.) Nobody can reconstruct the original string without knowing
the random string used to encode it, so only Bob can decode the
message. He does this by taking the XOR of the now meaningless string
of zeros and ones, received from Alice, with his own copy of the
random string that she used to do the encoding. The string he gets in this
way is $M \oplus A \oplus A$, where $M$ is the message, $A$ is the random string,
and $M \oplus A$ is the encoded message from Alice. Since $A \oplus A = 0$, Bob
recovers the original message.
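The encoding and decoding steps just described amount to two XOR operations. The following is a minimal sketch using bytes in place of individual bits; the sample message and the helper name `xor_bytes` are made up for illustration, and the pad is drawn from Python's `secrets` module rather than from any quantum source.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise modulo-2 sum of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return bytes(p ^ q for p, q in zip(a, b))

message = b"MEET AT NOON"                   # M, Alice's message (illustrative)
pad = secrets.token_bytes(len(message))     # A, the shared random string
ciphertext = xor_bytes(message, pad)        # M XOR A, sent over the open channel
recovered = xor_bytes(ciphertext, pad)      # (M XOR A) XOR A, computed by Bob
```

Because `xor_bytes(pad, pad)` is all zeros, `recovered` equals `message` whatever random pad was drawn.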

The problem with one-time codepads is that they can be used only
once. If an eavesdropper (Eve) picks up two messages encoded with
the same pad, she can take the XOR of the two encoded messages. The
random string used to encode the two messages drops out of the
process, leaving the XOR of the two unencoded messages. But the XOR
of two meaningful messages, combined with the usual code-breaking
tricks based on letter frequencies, can be used (with more subtlety
than would be required for a single message) to separate and decode
both texts. So to be perfectly secure Alice and Bob must continually
refresh their one-time codepad with new identical random strings
of bits.
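The way the pad drops out when it is reused can be seen in a short sketch. The two plaintexts below are invented examples of equal length, and `xor_bytes` is an illustrative helper, not a library function.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise modulo-2 sum of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return bytes(p ^ q for p, q in zip(a, b))

m1 = b"ATTACK AT DAWN"                 # two illustrative 14-byte messages
m2 = b"RETREAT AT TEN"
pad = secrets.token_bytes(len(m1))     # the same pad is (wrongly) used twice

c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)

# Eve never sees the pad, but XORing the two ciphertexts removes it.
eve_view = xor_bytes(c1, c2)

# If Eve guesses one message correctly, the other follows immediately.
m2_from_guess = xor_bytes(eve_view, m1)
```

Here `eve_view` equals `xor_bytes(m1, m2)` for every possible pad, which is the leak described above.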

The problem of exchanging such random strings in a secure way
might appear to be identical to the original problem of exchanging
meaningful messages in a secure way. But at this point quantum
mechanics comes to the rescue and provides an entirely secure means for
exchanging identical sequences of random bits. Pause to savor this
situation. Nobody has figured out how to exploit quantum mechanics to
provide a secure means for directly exchanging meaningful messages.
The secure exchange is possible only because the bit sequences are
random. On the face of it one would think nothing could be more
useless than such a transmission of noise. What is bizarre is that human
ingenuity combined with human perversity has succeeded in inventing
a context in which the need to hide information from a third party
actually provides a purpose for such an otherwise useless exchange of
random strings of bits.

The scheme for doing this is known as BB84 after its inventors,
Charles Bennett and Gilles Brassard, who published the idea in 1984.
Alice sends Bob a long sequence of photons. For each photon Alice
randomly chooses a polarization type for the photon (horizontal-vertical
or diagonal) and within each type she randomly chooses a polarization
state for the photon - one of the two orthogonal states associated
with that type of polarization. In Qbit language Alice sends Bob a long
sequence of Qbits randomly chosen to be in one of four states: $|0\rangle$
(polarized horizontally), $|1\rangle$ (polarized vertically), $H|0\rangle = (1/\sqrt{2})(|0\rangle + |1\rangle)$
(polarized diagonally along 45°), or $H|1\rangle = (1/\sqrt{2})(|0\rangle - |1\rangle)$
(polarized diagonally along $-$45°).

Reverting from photon-polarization language to our more familiar
quantum-computational language, we divide the four equally likely
types of Qbits that Alice sends to Bob into two categories: those with
state $|0\rangle$ or $|1\rangle$, which we call type-1 Qbits, and those with state $H|0\rangle$ or
$H|1\rangle$, which we call type-H Qbits. As each Qbit arrives Bob randomly
decides whether to send it directly through a measurement gate, or
to apply a Hadamard and only then send it through a measurement
gate. We call these two options type-1 and type-H measurements. The
Qbits must be individually identifiable - for example by the sequence
in which they arrive - so that Alice and Bob can compare what each of
them knows about each one.

Fig 6.3

Quantum cryptography. For each Qbit she sends to Bob, Alice randomly
decides which type of state to prepare it in (type 1 means $|x\rangle$ and
type H means $H|x\rangle$) and which state of that type ($x = 0$ or 1) to
prepare. For each Qbit he receives from Alice, Bob randomly decides
whether (H) or not (1) to apply a Hadamard gate before measuring it. In
those cases (about half, enclosed in rectangular boxes) for which Bob’s
choice of measurement type is the same as Alice’s choice of state, they
acquire identical random bits. When their choices differ they acquire
no useful information.

When Bob has measured all the Qbits in this way, Alice tells him
over an insecure channel which of the Qbits she sent him were type-1
and which were type-H. But she does not reveal which of the two
possible states she prepared within each type: $|0\rangle$ or $|1\rangle$ for type-1 Qbits
and $H|0\rangle$ or $H|1\rangle$ for type-H. For those Qbits (about half of them) for
which Bob’s random choice of measurement type agrees with Alice’s
random choice of which type to send, Bob learns from the result of his
measurement the actual random bit - 0 or 1 - that Alice chose to send.
For those Qbits (the other half) for which Bob’s choice of which type to
measure disagrees with Alice’s choice of which type to send, the result
of his measurement is completely uncorrelated with Alice’s choice of
bit, and reveals nothing about it. This is illustrated in Figure 6.3.

Finally, Bob tells Alice, over an insecure channel, which of the Qbits 
he subjected to a type of measurement that agreed with her choice of 
which type to prepare - i.e. which Qbits were of the kind that provides 
them with identical random bits. They discard the useless half of their 
data for which Bob’s type of measurement differed from Alice’s type of 
preparation. They are then able to construct their one-time codepads 
from the identical strings of random bits they have acquired. 
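The bookkeeping of the protocol, without an eavesdropper, can be mimicked by a purely classical simulation. The sketch below is an illustration under a simplifying assumption: instead of tracking state vectors it uses the two measurement rules stated above (a matching measurement type returns Alice's bit with certainty, a mismatched one returns a fair coin flip). The function names and the seed are arbitrary choices.

```python
import random

def measure(state_type, bit, measurement_type, rng):
    """Bob's outcome: Alice's bit if the types agree, a fair coin flip otherwise."""
    return bit if state_type == measurement_type else rng.randrange(2)

def bb84(num_qbits, rng):
    """Return the sifted bit strings held by Alice and Bob."""
    alice_types = [rng.choice("1H") for _ in range(num_qbits)]
    alice_bits = [rng.randrange(2) for _ in range(num_qbits)]
    bob_types = [rng.choice("1H") for _ in range(num_qbits)]
    bob_bits = [measure(a_type, a_bit, b_type, rng)
                for a_type, a_bit, b_type in zip(alice_types, alice_bits, bob_types)]
    # Only the types are compared over the insecure channel, never the bits.
    keep = [k for k in range(num_qbits) if alice_types[k] == bob_types[k]]
    return [alice_bits[k] for k in keep], [bob_bits[k] for k in keep]

rng = random.Random(0)          # fixed seed so the run is repeatable
alice_key, bob_key = bb84(1000, rng)
```

In such a run `alice_key` and `bob_key` are identical and contain roughly half of the 1000 transmitted bits, matching the "about half" stated in the text.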

You might wonder why Bob doesn’t wait to decide what type of
measurement to make on each Qbit until he learns Alice’s choice of
type for that photon, thereby doubling the number of shared random
bits. This would indeed be a sensible strategy if Bob could store the
Qbits he received from Alice. However, storing individual photons
in a polarization-preserving manner is difficult. For feasible quantum
cryptography today, Bob must make his decision and measure the
polarization of each photon as it arrives.

The reason Alice randomly varies the type of Qbit she sends to
Bob is to provide security against eavesdroppers. If Alice sent all Qbits
of the same type, then an eavesdropper, Eve, could acquire the same
information as Bob without being detected. If, for example, Alice and
Bob had agreed that all the Qbits would be type-1 and Eve learned
of this, then she could intercept each Qbit before it reached Bob and
send it directly through a measurement gate without altering its state,
subsequently sending it (or another Qbit she prepared in the state she
just learned) on to Bob. In this way she too could acquire the random
bit that Alice sends out and that Bob subsequently acquires when he
makes his own type-1 measurement. Nothing in the protocol would
give Bob a clue that Eve was listening in. But by making each Qbit
secretly and randomly of type 1 or type H Alice deprives Eve of this
strategy.

The best Eve can do, like Bob, is to make type-1 or type-H
measurements randomly. In doing so she necessarily reveals her presence. Bob
and Alice can determine that Eve has compromised the security of their 
bits by sacrificing some of the supposedly identical random bits they 
extracted from the Qbits they both ended up treating in the same way. 
They take a sample of these bits and check (over an insecure channel) 
to see whether they actually do agree, as they would in the absence of 
eavesdropping. If Eve intercepts the Qbits, randomly making type-1 or
type-H measurements of her own before sending them on to Bob, then
for about half of the useful Qbits her choice will differ from the common 
choice of Alice and Bob. In about half of those cases, Eve’s intervention 
will result in the outcome of Bob’s measurement disagreeing with what 
Alice sent him. If, for example, Eve makes a type-1 measurement of a
Qbit that Alice has prepared in the state H|0⟩, then she will necessarily
change its state to one or the other of the two states |0⟩ or |1⟩. In either
case if Bob then applies a Hadamard before measuring he will get the
result 0 only half the time.

So if Eve is systematically intercepting Qbits, Bob’s result will fail 
to agree with Alice’s preparation for about a quarter of their sample. 
This warns them that the transmission was insecure. If all the sample 
data agree except for a tiny fraction, then they can set an upper limit 
to the fraction of bits that Eve might have picked up, enabling them to 
make an informed judgment of the security with which they can use 
the remaining ones. 
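
The one-quarter disagreement rate can be checked with a small classical simulation. The sketch below is only an illustration under simplifying assumptions: a Qbit is modeled by the pair (bit, type), a measurement of the same type returns the bit, and a measurement of the other type returns a fair coin flip. The names `measure` and `sample_error_rate` are invented for this example.

```python
import random

def measure(bit, prep_type, meas_type, rng):
    """Outcome of measuring a Qbit that carries `bit` in type `prep_type`.

    Same type: the outcome reproduces the bit.
    Other type: the outcome is a fair coin flip.
    """
    return bit if prep_type == meas_type else rng.randrange(2)

def sample_error_rate(n_qbits, eve_present, seed=0):
    """Fraction of the kept bits on which Bob disagrees with Alice."""
    rng = random.Random(seed)
    kept = errors = 0
    for _ in range(n_qbits):
        a_bit, a_type = rng.randrange(2), rng.choice("1H")
        bit, qtype = a_bit, a_type
        if eve_present:
            e_type = rng.choice("1H")
            bit = measure(bit, qtype, e_type, rng)
            qtype = e_type            # the Qbit is now in the state Eve found
        b_type = rng.choice("1H")
        b_bit = measure(bit, qtype, b_type, rng)
        if b_type == a_type:          # only these Qbits are kept
            kept += 1
            errors += (b_bit != a_bit)
    return errors / kept
```

Without Eve the function returns exactly 0. With Eve intercepting every Qbit it should return a value close to 1/4 for large `n_qbits`, in line with the estimate above.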

Can Eve do better by a more sophisticated attack, that involved 
capturing each of Alice’s Qbits and processing it in a quantum computer 
that restored it to its initial state, before sending it on to Bob? This 
would eliminate the possibility of her eavesdropping being revealed to 
Bob. But the requirement that Alice’s Qbit be returned to its initial 
state also eliminates the possibility of Eve learning anything useful, for 
reasons rather like our earlier proof of the no-cloning theorem. 

Let |φ_μ⟩, μ = 0, ..., 3, be the four possible states of Alice’s Qbit:
|0⟩, |1⟩, H|0⟩, and H|1⟩. Let |Φ⟩ be the initial state of the n Qbits in
Eve’s computer, and let U be the (n + 1)-Qbit unitary transformation
the computer executes on its own Qbits and Alice’s. Since Alice’s Qbit
must emerge in its original state, we have

$$ U\bigl(|\phi_\mu\rangle \otimes |\Phi\rangle\bigr) = |\phi_\mu\rangle \otimes |\Psi_\mu\rangle. \tag{6.6} $$

Eve’s hope is to devise a U that yields four |Ψ_μ⟩ whose differences
enable her, by subsequent processing and measurement, to extract
useful information about which of the four possible states |φ_μ⟩ was.
But unitary transformations preserve inner products, so

$$ \langle\phi_\nu|\phi_\mu\rangle \langle\Phi|\Phi\rangle = \langle\phi_\nu|\phi_\mu\rangle \langle\Psi_\nu|\Psi_\mu\rangle. \tag{6.7} $$

Because ⟨Φ|Φ⟩ = 1 and because ⟨φ_ν|φ_μ⟩ ≠ 0 for μν = 02, 03, 12, 13,
it follows that

$$ \langle\Psi_\nu|\Psi_\mu\rangle = 1, \quad \mu\nu = 02, 03, 12, 13. \tag{6.8} $$

Since the inner product of two normalized states can be 1 only if they
are identical, it follows from (6.8) that

$$ |\Psi_0\rangle = |\Psi_1\rangle = |\Psi_2\rangle = |\Psi_3\rangle. \tag{6.9} $$

The price Eve pays for eliminating all traces of her eavesdropping is 
that the resulting state of her quantum computer can teach her nothing 
whatever about the four possible states of Alice’s Qbit. 
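
The step from (6.7) to (6.8) relies on the overlaps between Alice's four states being nonzero for exactly the pairs 02, 03, 12, and 13. A minimal numerical check, assuming the states are stored as amplitude pairs in the order |0⟩, |1⟩, H|0⟩, H|1⟩ (the names `STATES`, `inner`, and `overlaps` are illustrative only):

```python
import math

S = 1 / math.sqrt(2)
# Amplitudes (on |0>, on |1>) of the four states |0>, |1>, H|0>, H|1>.
STATES = [(1.0, 0.0), (0.0, 1.0), (S, S), (S, -S)]

def inner(u, v):
    """Inner product <u|v> of two 1-Qbit states."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

# Overlaps for every pair of distinct states, keyed by (mu, nu) with mu < nu.
overlaps = {(mu, nu): inner(STATES[nu], STATES[mu])
            for mu in range(4) for nu in range(4) if mu < nu}

nonzero_pairs = sorted(pair for pair, value in overlaps.items()
                       if abs(value) > 1e-12)
```

Here `nonzero_pairs` comes out as the four pairs (0, 2), (0, 3), (1, 2), (1, 3). Those four equalities are enough to chain all four of Eve's final states together, which is the content of (6.9).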

There is a less practical version of this cryptographic protocol that 
appears, at first sight, to be different, but turns out to be exactly the 
same. Suppose that there were some central source that produced pairs 
of Qbits in the entangled state 

$$ |\Psi\rangle = \tfrac{1}{\sqrt2}\bigl(|00\rangle + |11\rangle\bigr), \tag{6.10} $$

and then sent one member of each pair to Alice and the other to Bob. 
One easily verifies that 

$$ (H \otimes H)\tfrac{1}{\sqrt2}\bigl(|00\rangle + |11\rangle\bigr) = \tfrac{1}{\sqrt2}\bigl(|00\rangle + |11\rangle\bigr), \tag{6.11} $$

so if Alice and Bob make measurements of the same type, they will get 
identical random results. 
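
The identity (6.11) is easy to confirm numerically. The following sketch assumes a plain list-of-lists representation of matrices and orders the 2-Qbit amplitudes as |00⟩, |01⟩, |10⟩, |11⟩; the helper names `kron` and `apply` are made up for this example.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]                      # 1-Qbit Hadamard

def kron(a, b):
    """Tensor product of two matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

psi = [S, 0.0, 0.0, S]                     # the state (6.10)
out = apply(kron(H, H), psi)               # left side of (6.11)
unchanged = all(abs(o - p) < 1e-12 for o, p in zip(out, psi))
```

`unchanged` evaluates to `True`: applying Hadamards to both members of the pair leaves the entangled state as it was.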

This might seem even more secure than the first protocol, since 
the Qbits are in an entangled state until Alice or Bob actually makes a 
measurement. The correlated bits - the outcomes of the measurement - 
do not even exist until a measurement has been made, and that does 
not happen until both Qbits are safely in Alice’s and Bob’s separate 
possession. But this is only the case if Eve does not intercept a Qbit. If 
she does measure one before it gets to Bob or Alice, then the correlated 
bits do come into existence at the moment of her own measurement. 
This is later than in the first protocol (when each bit exists from the 
moment Alice performs her measurement) but early enough to help 
Eve in the same way as before. 

If Alice and Bob decided to produce their perfectly correlated random
bits by always making type-1 measurements then if Eve finds this
out she can intercept one member of the pair with type-1 measurements
of her own, disentangling the state prematurely, but in a way that
enables her to learn what each random bit is, while not altering the perfect
correlations between the values Alice and Bob will subsequently
measure. Alice and Bob can guard against this possibility by each randomly
(and, necessarily, independently) alternating between type-1 and type-H
measurements, and then following a procedure identical to the one
they used when Alice sent Bob Qbits in definite states. 

This returns us to the original protocol that made no use of entangled 
pairs. Indeed, if Alice measures her member of the entangled pair 
(making either a type-1 or a type-H measurement) before Bob measures
his, this is equivalent to her sending Bob a Qbit with a randomly selected 
state that she knows. The only difference is that now the random choice 
of which of the two states to send within each type is not made by 
Alice tossing a coin, but by the basic laws of quantum mechanics that 
guarantee that the outcome of her own measurement is random. 


6.3 Bit commitment 

One can try to formulate a similar protocol for a procedure called bit 
commitment. Suppose that Alice wishes to assure Bob that she has 
made a binary decision by a certain date, but does not wish to reveal 
that decision until some future time. She can do this by writing YES or 
NO on a card, putting the card in a box, locking the box, and sending 
the box, but not the key, to Bob. Once the box is in Bob’s possession he 
can be sure that Alice has not altered her decision, but while the key is 
in Alice’s possession she can be sure that Bob has not learned what that 
decision was. When it is time for her to reveal the decision she sends 
the key to Bob who opens the box and learns what it was. 

Of course Alice might worry about Bob breaking into the box by 
other means. Quantum mechanics offers a more secure procedure (but 
with an exotic loophole, which we return to momentarily). Alice
prepares a large number n of labeled Qbits. If her answer is YES, she takes
each Qbit to be randomly in the state |0⟩ or the state |1⟩. If her answer
is NO, she prepares each Qbit randomly in the state H|0⟩ or H|1⟩. In
either case she notes which Qbits are in which state, and then sends 
them all off to Bob, who stores them in a way that preserves both their 
state and their labels. (As noted above, such storage is beyond the range 
of current technology for polarized photons.) 

If Bob has a collection of n Qbits, each of which has been chosen
with equal probability to be in one of two orthogonal states |φ⟩ and |ψ⟩,
then there is no way for Bob to get any hint of what the two orthogonal
states are. If, for example, he measures every Qbit, then the probability
of getting 0 is

$$ p(0) = \tfrac12 |\langle 0|\phi\rangle|^2 + \tfrac12 |\langle 0|\psi\rangle|^2. \tag{6.12} $$

But

$$ |\langle 0|\phi\rangle|^2 + |\langle 0|\psi\rangle|^2 = 1, \tag{6.13} $$

since this is the sum of the squared moduli of the amplitudes of the
expansion of |0⟩ in the orthonormal basis given by |φ⟩ and |ψ⟩:

$$ |0\rangle = |\phi\rangle\langle\phi|0\rangle + |\psi\rangle\langle\psi|0\rangle. \tag{6.14} $$

So p(0) = 1/2. Bob’s measurement outcomes are completely random,
regardless of what the orthogonal pair of states actually is.
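
As a quick numerical illustration of (6.12) to (6.14), the sketch below builds orthonormal pairs from two angles and evaluates p(0) for several random choices. The parametrization (angles `theta` and `chi`) and the function names are assumptions introduced for this example; any orthonormal pair would do.

```python
import cmath
import math
import random

def orthonormal_pair(theta, chi):
    """Two orthogonal 1-Qbit states, as (amplitude on |0>, amplitude on |1>)."""
    phase = cmath.exp(1j * chi)
    phi = (math.cos(theta), phase * math.sin(theta))
    psi = (math.sin(theta), -phase * math.cos(theta))
    return phi, psi

def prob_zero(phi, psi):
    """Probability (6.12) of the outcome 0 for an equal mixture of phi and psi."""
    return 0.5 * abs(phi[0]) ** 2 + 0.5 * abs(psi[0]) ** 2

rng = random.Random(0)
probs = [prob_zero(*orthonormal_pair(rng.uniform(0, math.pi),
                                     rng.uniform(0, 2 * math.pi)))
         for _ in range(5)]
```

Every entry of `probs` equals 1/2 up to rounding, whatever pair was drawn.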

In Appendix P it is shown, more generally, that no information Bob 
can extract from his collection of Qbits can distinguish between the 
case in which each has a 50-50 chance of being in the state |0⟩ or |1⟩
and the case in which each has a 50-50 chance of being in the state
H|0⟩ or H|1⟩. There is no way Bob can learn Alice’s choice from the
Qbits that Alice has sent him. He cannot break into the locked box. 

(It is crucial for Bob’s inability to learn Alice’s choice that, regardless 
of what that choice is, she sends him a collection of Qbits each of whose 
two possible states is picked randomly. If, for example, she sent him 
exactly n/2 Qbits in the state |0⟩ and n/2 in the state |1⟩, in some random
order, then with probability 1 Bob would get an equal number of zeros
and ones if he measured in the computational basis. But if he applied H
before measuring, the outcome of each measurement would be random,
and the probability of getting equal numbers of zeros and ones for his
measurements would be quite small (asymptotically √(2/(πn))) for large
n. So if he got equal numbers of zeros and ones he could be rather sure
that Alice had sent him photons in the states |0⟩ and |1⟩ rather than in
the states H|0⟩ and H|1⟩.)
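
The asymptotic estimate can be compared with the exact binomial probability. This is a minimal sketch assuming n is even; the function names are illustrative.

```python
import math

def prob_equal_split(n):
    """Exact probability that n fair coin flips give exactly n/2 zeros (n even)."""
    return math.comb(n, n // 2) / 2 ** n

def asymptotic(n):
    """Large-n estimate sqrt(2 / (pi n)) of the same probability."""
    return math.sqrt(2 / (math.pi * n))

ratios = {n: prob_equal_split(n) / asymptotic(n) for n in (10, 100, 1000)}
```

For n = 100 the exact value is about 0.0796 and the estimate about 0.0798, and the ratio moves closer to 1 as n grows.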

When the time comes for Alice to reveal her choice for the pair of 
orthogonal states, she says to Bob something like this: “My answer was 
YES, so each of the Qbits I sent you was either in the state |0⟩ or in the
state |1⟩. To prove this I now tell you that I put Qbits 1, 2, 4, 6, 7, 11,
... into the state |0⟩ and I put Qbits 3, 5, 8, 9, 10, 12, ... into the state
|1⟩. You can confirm that I’m telling the truth by measuring each Qbit
directly.” 

Bob makes the direct measurements and gets every one of Alice’s 
predicted outcomes. If instead Alice had sent him Qbits whose states 
were randomly H|0⟩ or H|1⟩ she could do the same trick by telling Bob
exactly what he would find if he preceded each of his measurements 
with a Hadamard gate. But there is no way she could do the trick 
for measurements preceded by Hadamard gates in the first case or 
for direct measurements in the second. The best she could do if she 
wanted to deceive Bob would be to make random guesses for each 
outcome, and with n Qbits she would succeed in fooling him only with 
probability 1/2^n. So this works perfectly well, and without the worry
of Bob possessing unexpected safe-cracking skills. 

But, as noted above, there is a loophole - in fact, a fatal problem. 
The technological skills required to take advantage of the loophole are 
spectacularly greater than those required for the naive protocol, so one 
could imagine a stretch of years, decades, or even centuries during
which the naive protocol might actually be useful. But ultimately it
will be insecure. Suppose that Alice, unknown to Bob, has actually 
prepared n labeled pairs in the entangled state (6.10), sending one 
member to Bob while retaining the other for herself. Then the Qbits 
Bob receives will have no states of their own, being entangled with the 
Qbits Alice keeps for herself. Nevertheless, if Bob chooses to test some 
of them with measurements, (6.11) ensures that the results he gets will
be indistinguishable from the random outcomes he would have got if 
Alice had been playing the game honestly. No hint of her deception 
will be revealed by any test Bob can perform. 

But now when the time comes for Alice to reveal her choice, if she 
wants to prove to Bob that it was YES, she makes a direct measurement 
on each of the Qbits she has kept and correctly informs Bob what he will 
get if he makes a direct measurement on each of the paired Qbits. But if 
she wants to prove that it was NO, she instead applies Hadamards before 
measuring each of her Qbits, enabling her, because of the identity (6.11), 
to tell Bob what he will find if he also applies Hadamards before
measuring his own Qbits. So she can use entangled pairs of Qbits to cheat at
what would otherwise be a perfectly secure bit-commitment protocol. 

Alice can cheat in the same way even if Bob measures his Qbits 
(randomly applying or not applying a Hadamard before each
measurement) before she “reveals” her commitment. If she wants to “prove”
to Bob she had sent him YES she directly measures each of her Qbits 
and tells Bob all her results. He notes that they do indeed agree with 
all the results he found for his direct measurements, and is persuaded 
that she had indeed sent him YES. To “prove” she sent him NO she 
applies Hadamards before measuring each of her Qbits. 

Of course the success of Alice’s cheating depends crucially on Bob’s 
knowing all about 1-Qbit states, but never having taken the kind of 
course in quantum mechanics that would have taught him anything 
about entangled 2-Qbit states. If Bob is as sophisticated a student of 
the quantum theory as Alice, they will both realize that the protocol is 
fatally flawed, since it can be defeated by entanglement. 

It is in this context that Einstein’s famous complaint about spooky 
actions at a distance (“spukhafte Fernwirkungen”) seems pertinent. By
finally measuring her members of the entangled pairs, Alice seems to
convert the distant Qbits in Bob’s possession into the kind she
deceptively said she had sent him long ago, while retaining until the last
minute the option of which of the two kinds to pick. But of course 
Alice’s action is not so much on the Qbits in Bob’s possession as it is on 
what it is possible for her to tell him about what he can learn from those 
Qbits. It is this peculiar tension between what is objective (ontology) 
and what is known (epistemology) that makes quantum mechanics such 
a source of delight (or anguish) to the philosophically inclined. 

Something like Alice’s discovery of the value of entanglement for 
cheating actually happened in the historical development of these ideas
about quantum information processing. When the bit-commitment
protocol described above was first put forth it was realized that
entangled pairs could be used to thwart it, but more sophisticated versions
were proposed that were believed to be immune to cheating with
entanglement. There developed a controversy over whether some form
of bit commitment could or could not be devised that would be secure 
even if entanglement were fully exploitable. The current consensus is 
that there is no way to use Qbits in a bit-commitment protocol that 
cannot be defeated by using entangled states. Indeed, it has even been 
suggested that the structure of quantum mechanics might be uniquely 
determined by requiring it to enable the secure exchange of random 
strings of bits, as in quantum cryptography, but not to enable bit
commitment. Nobody has managed to show this. It does seem implausible
that God would have taken as a fundamental principle of design that 
certain kinds of covert activity should be possible while others should 
be forbidden. 


6.4 Quantum dense coding 

Although an infinite amount of information is needed to specify the
state |ψ⟩ = α|0⟩ + β|1⟩ of a single Qbit, there is no way for somebody
who has acquired possession of the Qbit to learn what that state is, as we
have often noted. If Alice prepares a Qbit in the state |ψ⟩ and sends it
to Bob, all he can do is apply a unitary transformation of his choice and
then measure the Qbit, getting the value 0 or 1. After that the state of
the Qbit is either |0⟩ or |1⟩ and no further measurement can teach him
anything about its original state |ψ⟩. The most Alice can communicate
to Bob by sending him a single Qbit is a single bit of information. 

If, however, Alice has one member of an entangled pair of Qbits in 
the state 


$$ |\Psi\rangle = \tfrac{1}{\sqrt2}\bigl(|0\rangle|0\rangle + |1\rangle|1\rangle\bigr) \tag{6.15} $$

and Bob has the other, then by suitably preparing her member of the 
pair and then sending it to Bob, she can convey to him two bits of 
information. She does this by first applying the transformation 1, X, Z, 
or ZX to her Qbit, depending on whether she wants to send Bob the 
message 00, 01, 10, or 11. If hers is the Qbit on the left in (6.15) these 
transform the state of the pair into one of the four mutually orthogonal 
Bell states (6.5), 

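$$ \begin{aligned}
1_a|\Psi\rangle &= \tfrac{1}{\sqrt2}\bigl(|0\rangle|0\rangle + |1\rangle|1\rangle\bigr), \\
X_a|\Psi\rangle &= \tfrac{1}{\sqrt2}\bigl(|1\rangle|0\rangle + |0\rangle|1\rangle\bigr), \\
Z_a|\Psi\rangle &= \tfrac{1}{\sqrt2}\bigl(|0\rangle|0\rangle - |1\rangle|1\rangle\bigr), \\
Z_a X_a|\Psi\rangle &= \tfrac{1}{\sqrt2}\bigl(|0\rangle|1\rangle - |1\rangle|0\rangle\bigr).
\end{aligned} \tag{6.16} $$

She then sends her Qbit over to Bob. He sends the pair through the
controlled-NOT gate C_ab, using the Qbit he received from Alice as
control, to get

$$ \begin{aligned}
C_{ab} 1_a|\Psi\rangle &= \tfrac{1}{\sqrt2}\bigl(|0\rangle + |1\rangle\bigr)|0\rangle, \\
C_{ab} X_a|\Psi\rangle &= \tfrac{1}{\sqrt2}\bigl(|0\rangle + |1\rangle\bigr)|1\rangle, \\
C_{ab} Z_a|\Psi\rangle &= \tfrac{1}{\sqrt2}\bigl(|0\rangle - |1\rangle\bigr)|0\rangle, \\
C_{ab} Z_a X_a|\Psi\rangle &= \tfrac{1}{\sqrt2}\bigl(|0\rangle - |1\rangle\bigr)|1\rangle,
\end{aligned} \tag{6.17} $$

and then he applies a Hadamard transform to get

$$ \begin{aligned}
H_a C_{ab} 1_a|\Psi\rangle &= |0\rangle|0\rangle, \\
H_a C_{ab} X_a|\Psi\rangle &= |0\rangle|1\rangle, \\
H_a C_{ab} Z_a|\Psi\rangle &= |1\rangle|0\rangle, \\
H_a C_{ab} Z_a X_a|\Psi\rangle &= |1\rangle|1\rangle.
\end{aligned} \tag{6.18} $$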

Measuring the two Qbits then gives him 00, 01, 10, or 11 - precisely 
the two-bit message Alice wished to send. 
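
The chain (6.15) to (6.18) can be followed step by step in a few lines of code. The sketch below is an illustration only: it stores the 2-Qbit state as four amplitudes ordered |00⟩, |01⟩, |10⟩, |11⟩, with Alice's Qbit a on the left, and all function names are invented here.

```python
import math

S = 1 / math.sqrt(2)

def x_a(v):
    """X on the left Qbit: exchange |0b> and |1b>."""
    return [v[2], v[3], v[0], v[1]]

def z_a(v):
    """Z on the left Qbit: change the sign of |1b>."""
    return [v[0], v[1], -v[2], -v[3]]

def cnot_ab(v):
    """cNOT with the left Qbit as control: exchange |10> and |11>."""
    return [v[0], v[1], v[3], v[2]]

def h_a(v):
    """Hadamard on the left Qbit."""
    return [S * (v[0] + v[2]), S * (v[1] + v[3]),
            S * (v[0] - v[2]), S * (v[1] - v[3])]

def dense_code(message):
    """Return the two bits Bob reads when Alice encodes `message`, e.g. "10"."""
    v = [S, 0.0, 0.0, S]          # shared entangled state (6.15)
    if message[1] == "1":
        v = x_a(v)                # X for the messages 01 and 11
    if message[0] == "1":
        v = z_a(v)                # Z for 10 and 11 (applied after X, giving ZX)
    v = h_a(cnot_ab(v))           # Bob's decoding: (6.17), then (6.18)
    # The result is a single computational-basis state; locate it.
    index = max(range(4), key=lambda i: abs(v[i]))
    assert abs(abs(v[index]) - 1) < 1e-12
    return format(index, "02b")
```

Calling `dense_code` with each of "00", "01", "10", and "11" returns the same string, which is the statement that Bob's measurement recovers Alice's two bits.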

This process of transforming the Bell basis back into the
computational basis - i.e. undoing the process (6.3) by which the Bell basis
was constructed from the computational basis - and then measuring is 
called “measuring in the Bell basis.” 

One can directly demonstrate that this works with circuit diagrams, 
without going through any of the analysis in (6.15)-(6.18). Suppose
that Alice represents the two bits x and y she wishes to transmit to Bob 
as the computational-basis state |x⟩|y⟩ of two Qbits (the top two wires,
Figure 6.4(a)). If Bob has two Qbits initially in the state |0⟩|0⟩ (the
bottom two wires in Figure 6.4(a)), then the circuit in Figure 6.4(a)
gets the two bits to Bob in a straightforward classical way, transforming
the state |x⟩|y⟩|0⟩|0⟩ on the right to |x⟩|y⟩|x⟩|y⟩ on the left by means
of direct Qbit-to-Qbit coupling via two cNOT gates. The procedure 
involves only classical operations on classically meaningful states. It 
gets the two bits from Alice to Bob by explicit interactions between her 
Qbits and his. It would work equally well for Cbits.

One can transform this direct classical procedure into the more 
exotic quantum protocol by expanding the cNOT gates into products 
of quantum gates. One first expands one of the C gates into H C^Z H in
Figure 6.4(b). Because Z acting on the control Qbit commutes with C
and because C is its own inverse, we can further expand Figure 6.4(b)
to Figure 6.4(c). We can then bring the H and C gates on either side
of the C^Z to the extreme left and right to get Figure 6.4(d). We can
also expand the two C gates on the left of Figure 6.4(d) into the three 
C gates on the left of Figure 6.4(e), since the action of either set is to 
flip the target Qbit if and only if the computational-basis states of the 
two control Qbits are different, while leaving the states of the control
Qbits unaltered. Because the state H|0⟩ = (1/√2)(|0⟩ + |1⟩) is invariant
under the action of X, the C on the extreme left of Figure 6.4(e)
acts as the identity, and Figure 6.4(e) simplifies to Figure 6.4(f).

Figure 6.4. A circuit-theoretic derivation of the quantum dense-coding protocol.

The fact that Figure 6.4(f) has the same action as Figure 6.4(a) 
contains all the content of the dense-coding protocol. The pair of gates 
C_10 H_1 on the left of Figure 6.4(f) acts on the state |0⟩|0⟩ to produce
the entangled state (6.15). The bottom Qbit of the pair, Qbit 0, is given
to Bob and the upper one, Qbit 1, is given to Alice, who also possesses
the upper two, Qbits 2 and 3. The pair of gates C^Z_31 C_21 acts as 1, X, Z,
or ZX on Qbit 1 depending on whether the states of Qbits 3 and 2
are |0⟩|0⟩, |0⟩|1⟩, |1⟩|0⟩, or |1⟩|1⟩. This reproduces the transformation
Alice applies to the member of the entangled pair in her possession,
depending on the values of the two bits she wishes to transmit to Bob.
Alice then sends Qbit 1 to Bob. The final pair H_1 C_10 on the right is
precisely the transformation (6.18) that Bob performs on the reunited
entangled pair before making his measurement, which yields the values
x, y that Alice wished to transmit.

Like dense coding, many tricks of quantum information theory,
including the one we examine next, teleportation, rely on two or more
people sharing entangled Qbits, prepared some time ago, carefully 
stored in their remote locations awaiting an occasion for their use. 
While the preparation of entangled Qbits (in the form of photons) and 
their transmission to distant places has been achieved, putting them 
into entanglement-preserving, local, long-term storage remains a
difficult challenge.


6.5 Teleportation 

Suppose that Alice has a Qbit in a state 

$$ |\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \tag{6.19} $$

but she does not know the amplitudes α and β. Carol, for example, may
have prepared the Qbit for Alice by taking a Qbit initially assigned the
standard state |0⟩, applying a specific unitary transformation to it, and
then giving it to Alice, without telling her what unitary transformation 
she applied. 

Alice would like to reassign that precise state to another Qbit
possessed by Bob. Neither Alice nor Bob (who could be far away from
Alice) has any access to the other’s Qbit. Alice is, however, allowed to 
send “classical information” to Bob - e.g. she can talk to him over the 
telephone. And, crucially, Bob’s Qbit shares with a second Qbit of Alice 
the 2-Qbit entangled state 

$$ |\Psi\rangle = \tfrac{1}{\sqrt2}\bigl(|0\rangle|0\rangle + |1\rangle|1\rangle\bigr). \tag{6.20} $$

The no-cloning theorem prohibits duplicating the unknown state
of Alice’s first Qbit, either far away from her or nearby. But it turns
out to be possible for Alice and Bob to cooperate over the telephone
in assigning the state |ψ⟩ to Bob’s member of the entangled pair. The
no-cloning theorem is not violated because in doing so Alice obliterates
all traces of the state |ψ⟩ from either of her own Qbits. The process -
called teleporting the state from Alice to Bob - also eliminates the
entanglement Alice and Bob formerly shared. For each shared entangled
pair, they can teleport just a single 1-Qbit state. The term
“teleportation” emphasizes that the state assignment acquired by Bob’s Qbit no
longer applies to Alice’s; it has been transported from her Qbit to his. 

Here is how teleportation works. Alice’s first Qbit and the entangled 
pair she shares with Bob are characterized by the 3-Qbit state 

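$$ |\psi\rangle_a |\Psi\rangle_{ab} = \bigl(\alpha|0\rangle_a + \beta|1\rangle_a\bigr)\tfrac{1}{\sqrt2}\bigl(|0\rangle_a|0\rangle_b + |1\rangle_a|1\rangle_b\bigr), \tag{6.21} $$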

where I have given the state symbols for the Qbits in Alice’s and Bob’s 
possession subscripts a and b . To teleport the unknown state of her 
Qbit to Bob’s member of the entangled pair, Alice first applies a cNOT 
gate, using her first Qbit in the state |ψ⟩ as the control and her member
of the shared entangled pair as the target. This produces the 3-Qbit 
state 

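$$ \alpha|0\rangle_a\tfrac{1}{\sqrt2}\bigl(|0\rangle_a|0\rangle_b + |1\rangle_a|1\rangle_b\bigr) + \beta|1\rangle_a\tfrac{1}{\sqrt2}\bigl(|1\rangle_a|0\rangle_b + |0\rangle_a|1\rangle_b\bigr). \tag{6.22} $$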

Next she applies a Hadamard transformation H to her first Qbit, giving 
all three Qbits the state 

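$$ \begin{aligned}
&\alpha\tfrac{1}{\sqrt2}\bigl(|0\rangle_a + |1\rangle_a\bigr)\tfrac{1}{\sqrt2}\bigl(|0\rangle_a|0\rangle_b + |1\rangle_a|1\rangle_b\bigr)
 + \beta\tfrac{1}{\sqrt2}\bigl(|0\rangle_a - |1\rangle_a\bigr)\tfrac{1}{\sqrt2}\bigl(|1\rangle_a|0\rangle_b + |0\rangle_a|1\rangle_b\bigr) \\
&\quad = \tfrac12|0\rangle_a|0\rangle_a\bigl(\alpha|0\rangle_b + \beta|1\rangle_b\bigr) + \tfrac12|1\rangle_a|0\rangle_a\bigl(\alpha|0\rangle_b - \beta|1\rangle_b\bigr) \\
&\qquad + \tfrac12|0\rangle_a|1\rangle_a\bigl(\alpha|1\rangle_b + \beta|0\rangle_b\bigr) + \tfrac12|1\rangle_a|1\rangle_a\bigl(\alpha|1\rangle_b - \beta|0\rangle_b\bigr).
\end{aligned} \tag{6.23} $$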

Now Alice measures both Qbits in her possession. (As remarked 
in Section 6.4, such an application of cNOT and Hadamard gates, 
immediately followed by measurement gates, is called “measuring in 
the Bell basis.”) If the result is 00, Bob’s Qbit will indeed acquire the 
state |ψ⟩ originally possessed by Alice’s first Qbit (whose state would
then be reduced to |0)). But if the result of Alice’s measurement is 10, 
01, or 11 then the state of Bob’s Qbit becomes 

$$ \alpha|0\rangle_b - \beta|1\rangle_b, \quad \alpha|1\rangle_b + \beta|0\rangle_b, \quad\text{or}\quad \alpha|1\rangle_b - \beta|0\rangle_b. \tag{6.24} $$

In each of these three cases there is a unitary transformation that
restores the state of Bob’s Qbit to Alice’s original state |ψ⟩. In the first
case we can apply Z (which leaves |0⟩ alone but changes the sign of |1⟩),
in the second case, X (which interchanges |0⟩ and |1⟩), and in the third
case, ZX.

So all Alice need do to transfer the state of her Qbit to Bob’s member
of their entangled pair is to telephone Bob and report to him the results
of her two measurements. He then knows whether the state has already
been transferred (if Alice’s result is 00) or what unitary transformation
he must apply to his member of the entangled pair in order to
complete the transfer (if Alice’s result is one of the other three). Note the
resemblance to quantum error correction: by making a measurement
Alice acquires the information needed for Bob to reconstruct a
particular quantum state, without anybody acquiring any information about
what the state actually is. 
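
The protocol of (6.21) to (6.24) can be traced numerically for all four of Alice's possible outcomes. The sketch below is an illustration, not an implementation taken from the text: it labels each amplitude by the bits (s, a, b) of Alice's state Qbit, her member of the pair, and Bob's member, and it assumes that `alpha` and `beta` are normalized. The function name `teleport` is invented.

```python
import math

S = 1 / math.sqrt(2)

def teleport(alpha, beta):
    """Bob's corrected state for each outcome (s, a) of Alice's measurement."""
    bits = [(s, a, b) for s in (0, 1) for a in (0, 1) for b in (0, 1)]
    # Initial 3-Qbit state (6.21).
    amp = {(s, a, b): (alpha if s == 0 else beta) * (S if a == b else 0.0)
           for s, a, b in bits}
    # cNOT with s as control and a as target gives (6.22).
    amp = {(s, a ^ s, b): c for (s, a, b), c in amp.items()}
    # Hadamard on s gives (6.23).
    new = {k: 0.0 for k in bits}
    for (s, a, b), c in amp.items():
        new[0, a, b] += S * c
        new[1, a, b] += S * c * (-1 if s else 1)
    amp = new
    results = {}
    for s in (0, 1):
        for a in (0, 1):
            b0, b1 = amp[s, a, 0], amp[s, a, 1]     # Bob's unnormalized state
            norm = math.sqrt(abs(b0) ** 2 + abs(b1) ** 2)
            b0, b1 = b0 / norm, b1 / norm
            if a == 1:                              # Bob applies X
                b0, b1 = b1, b0
            if s == 1:                              # then Bob applies Z
                b1 = -b1
            results[s, a] = (b0, b1)
    return results
```

For example, `teleport(0.6, 0.8j)` returns the pair (0.6, 0.8j), up to rounding, for every one of the four outcomes: after the correction that Alice's phone call dictates, Bob's Qbit is always in the state |ψ⟩.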

This appears to be remarkable. A general state of a Qbit is described 
by two complex numbers α and β that take on a continuum of values,
constrained only by the requirement that |α|² + |β|² = 1. Yet, with the
aid of a standard entangled pair, whose state does not depend on α and
β, Alice is able to provide Bob with a Qbit described by the unknown
state, at the price of only two bits of classical information (giving the 
results of her two measurements) and the loss of the entanglement of 
their pair. 

But of course the teleportation process does not communicate to Bob 
the information that can be encoded in α and β. Bob is no more able to
learn the values of α and β from manipulating his Qbit, now assigned
the state |ψ⟩, than Alice was able to do when it was her Qbit that was
assigned the same state. On the other hand Alice’s state could be
produced at a crucial stage of an elaborate quantum computation, and 
its transfer to Bob could enable him to continue with the computation 
on his own far-away quantum computer, so one can achieve a nontrivial 
objective by such teleportations. 

Like dense coding, teleportation can also be constructed by
manipulating an elementary classical circuit diagram, without going through
any of the analysis in (6.21)-(6.24). Figure 6.5(a) shows a circuit that
exchanges the state |ψ⟩ = |x⟩ of Alice’s Cbit with the state |0⟩ of Bob’s
Cbit, regardless of whether x = 0 or 1. The transfer is achieved by
direct physical coupling between the two Cbits. As a linear quantum
circuit it continues to perform this exchange for arbitrary
superpositions, |ψ⟩ = α|0⟩ + β|1⟩. The entire teleportation protocol can be
constructed by appropriately expanding the two gates in Figure 6.5(a),
with the aid of an ancillary Qbit. The aim of the expansion is to
eliminate the direct interaction between Alice’s and Bob’s Qbits through
the two cNOT gates in Figure 6.5(a), in favor of the telephoned
message from Alice to Bob, and the interaction necessary to produce their
shared pair of entangled Qbits (which can take place well before Alice
has even acquired her Qbit in the state |ψ⟩).

In Figure 6.5(b) an ancillary Qbit, not acted upon throughout the
process, is introduced in the state

$$ |\phi\rangle = H|0\rangle = \tfrac{1}{\sqrt2}\bigl(|0\rangle + |1\rangle\bigr). \tag{6.25} $$

In Figure 6.5(c) the identities X = HZH and 1 = HH have been
used to rewrite the cNOT gate on the right of Figure 6.5(b), and an 
additional cNOT gate has been added on the left, which acts as the 
identity, since X acts as the identity on the state H|0⟩.

Figure 6.5(d) follows from Figure 6.5(c) because the action of C^Z
is independent of which Qbit is the control and which the target, and
because the two cNOT gates on the left of Figure 6.5(c) have exactly
the same action as the three cNOT gates on the left of Figure 6.5(d):
acting on the computational basis, both sets of gates apply X on both
of the bottom two wires if the state of the top wire is |1⟩ and act as the
identity if the state of the top wire is |0⟩.

Figure 6.5(e) follows from Figure 6.5(d) if we write the |φ⟩ on the
left of Figure 6.5(d) as H|0⟩ and explicitly write the |0⟩ on the right
of Figure 6.5(d) as H|φ⟩. But Figure 6.5(e) is an automated version
of teleportation. To relate it to ordinary teleportation, introduce
measurements of the upper two Qbits after the circuit of Figure 6.5(e) has
acted, as in Figure 6.5(f). Their effect is to collapse the states of each 
of the two upper wires randomly and independently to |0) or |1). But 
as noted in Section 3.6, measurement of a control Qbit commutes with 
any operation controlled by that Qbit, so the measurement gates can 
be moved to the positions they occupy in Figure 6.5(g). 

Figure 6.5(g) is precisely the teleportation protocol. The two gates 
on the left transform the two lower Qbits into the entangled state 
(6.20). The subsequent applications to the top two Qbits of cNOT 
followed by H followed by two measurement gates are precisely Alice’s 
“measurement in the Bell basis.” Since Alice knows the outcomes of 
the measurements, she knows whether the subsequent cNOT and C^Z
gates will or will not act, and she can replace these physical couplings 
by a phone call to Bob telling him whether or not to apply X and/or Z 
directly to his own Qbit. 

Figure 6.6 demonstrates that entanglement can also be teleported. 
The figure reproduces parts (b), (e), and (g) of Figure 6.5 with three 
changes. (1) A bar representing n Qbits in the n-Qbit state |Φ_i⟩ has
been added above each part of the figure. No operations act on these
additional Qbits. (2) The state to be teleported has been given a
subscript i so it is now one of several possible states |ψ_i⟩. (3) Because
of the linearity of the unitary gates we may sum over the index i.
The effect of the circuit is to transfer participation in the entangled
state Σ_i |Φ_i⟩|ψ_i⟩ from the third wire from the bottom to the bottom
wire. 

So even if Alice’s Qbit has no state of its own but is entangled with 
other Qbits, Alice can use the same protocol to teleport its role in 
the entangled state over to Bob’s Qbit. The result is that Bob’s Qbit 
becomes entangled in exactly the same way Alice’s was, and Alice’s 
Qbit becomes entirely unentangled.

Figure 6.6. A demonstration that entanglement can be teleported.


6.6 The GHZ puzzle

We conclude with another illustration of just how strange the behavior 
of Qbits can be. The situation described below is a 3-Qbit version of 
one first noticed by Daniel Greenberger, Michael Horne, and Anton 
Zeilinger (“GHZ”) in the late 1980s, which gives a very striking version 
of Bell’s theorem. An alternative version, discovered by Lucien Hardy 
in the early 1990s, is given in Appendix D. 

Consider the 3-Qbit state 

$$ |\Psi\rangle = \tfrac12\bigl(|000\rangle - |110\rangle - |011\rangle - |101\rangle\bigr). \tag{6.26} $$

Note that the form of |Ψ⟩ is explicitly invariant under any permutation
of the three Qbits. Numbering the Qbits from left to right 2, 1, and 0,
we have

$$ |\Psi\rangle = C_{21} H_2 X_2 \tfrac{1}{\sqrt2}\bigl(|000\rangle - |111\rangle\bigr). \tag{6.27} $$

Since

$$ \tfrac{1}{\sqrt2}\bigl(|000\rangle - |111\rangle\bigr) = C_{21} C_{20} H_2 X_2 |000\rangle, \tag{6.28} $$

(6.27) and (6.28) provide an explicit construction of |Ψ⟩ from
elementary 1- and 2-Qbit gates acting on the standard state |0⟩_3.

Because |Ψ⟩ in the form (6.26) and the state (1/√2)(|000⟩ − |111⟩)
appearing in (6.27) are both invariant under permutations of the Qbits
0, 1, and 2, any of the other five forms of (6.27) associated with
permutations of the subscripts 0, 1, and 2 are equally valid. In particular

$$ |\Psi\rangle = C_{12} H_1 X_1 \tfrac{1}{\sqrt2}\bigl(|000\rangle - |111\rangle\bigr). \tag{6.29} $$

It follows from (6.29) that 


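$$\begin{aligned}
H_2 H_1 |\Psi\rangle &= H_2 H_1 C_{12} H_1 X_1 \tfrac{1}{\sqrt{2}}\bigl(|000\rangle - |111\rangle\bigr) \\
&= (H_2 H_1 C_{12} H_1 H_2)\, H_2 X_1 \tfrac{1}{\sqrt{2}}\bigl(|000\rangle - |111\rangle\bigr) \\
&= C_{21} H_2 X_1 \tfrac{1}{\sqrt{2}}\bigl(|000\rangle - |111\rangle\bigr)
\end{aligned} \tag{6.30}$$

(since sandwiching a cNOT between Hadamards exchanges target and
control Qbits). Comparing the last expression in (6.30) with the form
of $|\Psi\rangle$ in (6.27) reveals that

$$H_2 H_1 |\Psi\rangle = Z_2 X_1 |\Psi\rangle \tag{6.31}$$

(which can, of course, be confirmed more clumsily directly from the
definition (6.26) of $|\Psi\rangle$). Because of the invariance of $|\Psi\rangle$ under
permutation of the three Qbits we also have

$$H_2 H_0 |\Psi\rangle = Z_2 X_0 |\Psi\rangle, \tag{6.32}$$

$$H_1 H_0 |\Psi\rangle = Z_1 X_0 |\Psi\rangle. \tag{6.33}$$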
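The gate identities above are easy to confirm numerically. The following Python sketch, which assumes NumPy is available, represents 3-Qbit states as length-8 vectors and checks (6.27), (6.28), and (6.31)-(6.33). The helper names `on`, `cnot`, and `ket` are illustrative choices, not notation from the text.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
H = (X + Z) / np.sqrt(2)
P0 = np.diag([1.0, 0.0])  # projector |0><0|
P1 = np.diag([0.0, 1.0])  # projector |1><1|

def on(ops):
    """3-Qbit operator acting with ops[q] on Qbit q (identity elsewhere).
    Qbits are ordered 2, 1, 0 from left to right, as in the text."""
    return reduce(np.kron, [ops.get(q, I2) for q in (2, 1, 0)])

def cnot(control, target):
    """C_{control,target}: apply X to the target when the control is |1>."""
    return on({control: P0}) + on({control: P1, target: X})

def ket(bits):
    """Computational-basis state |x2 x1 x0> as a length-8 vector."""
    v = np.zeros(8)
    v[int(bits, 2)] = 1.0
    return v

psi = 0.5 * (ket("000") - ket("110") - ket("011") - ket("101"))  # (6.26)

# (6.28): build (|000> - |111>)/sqrt(2) from |000>, then apply (6.27).
ghz = cnot(2, 1) @ cnot(2, 0) @ on({2: H}) @ on({2: X}) @ ket("000")
built = cnot(2, 1) @ on({2: H}) @ on({2: X}) @ ghz

checks = {
    "(6.27)": np.allclose(built, psi),
    "(6.31)": np.allclose(on({2: H, 1: H}) @ psi, on({2: Z, 1: X}) @ psi),
    "(6.32)": np.allclose(on({2: H, 0: H}) @ psi, on({2: Z, 0: X}) @ psi),
    "(6.33)": np.allclose(on({1: H, 0: H}) @ psi, on({1: Z, 0: X}) @ psi),
}
print(checks)
```

Each entry of `checks` should come out `True` if the identities hold as stated.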


Now suppose that we have prepared three Qbits in the state $|\Psi\rangle$ and
then allowed no further interactions among them. If we measure each
Qbit, it follows from the form (6.26) that because $|\Psi\rangle$ is a superposition
of computational-basis states having either none or two of the Qbits in
the state $|1\rangle$, the three outcomes are constrained to satisfy

$$x_2 \oplus x_1 \oplus x_0 = 0 \tag{6.34}$$

(where $\oplus$, as usual, denotes addition modulo 2).

Suppose, on the other hand, that we apply Hadamards to Qbits
2 and 1 before measuring all three. According to (6.31) this has the
effect of flipping the state of Qbit 1 in each term of the superposition
(6.26) (and changing the signs of some of the terms). As a result the
3-Qbit state (6.26) is changed into a superposition of computational-basis
states having either one or three of the Qbits in the state $|1\rangle$. So
if the outcomes are $x_2^H$, $x_1^H$, and $x_0$, we must have

$$x_2^H \oplus x_1^H \oplus x_0 = 1. \tag{6.35}$$

Similarly, if we apply Hadamards to Qbits 2 and 0 before measuring all
three, then (6.32) requires that the outcomes must obey

$$x_2^H \oplus x_1 \oplus x_0^H = 1, \tag{6.36}$$

and if Hadamards are applied to Qbits 1 and 0 then according to (6.33)
if all three are measured we will have

$$x_2 \oplus x_1^H \oplus x_0^H = 1. \tag{6.37}$$
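The four parity constraints can be read off directly from the measurement probabilities. The Python sketch below (assuming NumPy; the function name `parities` is a hypothetical helper) applies Hadamards to the chosen Qbits and collects the parities $x_2 \oplus x_1 \oplus x_0$ of all outcomes that occur with nonzero probability.

```python
import numpy as np
from functools import reduce

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
I2 = np.eye(2)

# |Psi> of (6.26); the vector index is 4*x2 + 2*x1 + x0.
psi = np.zeros(8)
psi[0b000] = 0.5
psi[[0b110, 0b011, 0b101]] = -0.5

def parities(hadamard_on):
    """Set of parities x2^x1^x0 that can occur when Hadamards are applied
    to the listed Qbits and all three Qbits are then measured."""
    U = reduce(np.kron, [H if q in hadamard_on else I2 for q in (2, 1, 0)])
    probs = np.abs(U @ psi) ** 2
    return {bin(x).count("1") % 2 for x in range(8) if probs[x] > 1e-12}

settings = [(), (2, 1), (2, 0), (1, 0)]   # (6.34), (6.35), (6.36), (6.37)
results = {s: parities(s) for s in settings}
print(results)
```

With no Hadamards only even parity should appear; in each of the other three settings only odd parity should appear.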

Consider now the following question. If we are talking about a single
trio of Qbits, assigned the state $|\Psi\rangle$, must the $x_0$ appearing in (6.34) be
the same as the $x_0$ appearing in (6.35)? A little reflection reveals that
this question makes no sense. After all, (6.34) describes the outcomes of
immediately measuring the three Qbits, whereas (6.35) describes the
outcomes of measuring them after Hadamards have been applied to
Qbits 2 and 1. Since only one of these two possibilities can actually be
carried out, there is no way to compare the results of measuring Qbit
0 in the two cases. You can’t compare the $x_0$ you found in the case you
actually carried out with the $x_0$ you might have found in the case you
didn’t carry out. It’s just a stupid question.

Or is it? Suppose that Qbits 2 and 1 are measured before Qbit 0 is 
measured. If no Hadamards were applied before the measurements of 
2 and 1, then (6.34) assures us that when 0 is finally measured the result 
will be 


$$x_0 = x_1 \oplus x_2. \tag{6.38}$$

So the outcome of measuring Qbit 0 is predetermined by the outcomes
of the earlier measurements of Qbits 2 and 1. Since all interactions
among the Qbits ceased after the state $|\Psi\rangle$ had been prepared, subjecting
Qbits 2 and 1 to measurement gates can have no effect on Qbit 0.
Since the outcomes of the measurements of Qbits 2 and 1 determine 
in advance the outcome of the subsequent measurement of Qbit 0, it 
would seem that Qbit 0 was already predisposed to give the result (6.38) 
upon being measured. Because the Qbits did not interact after their 
initial state was prepared, it would seem that Qbit 0 must have had 
that predisposition even before Qbits 2 and 1 were actually measured 
to reveal what the result of measuring Qbit 0 would have to be. 

This is a bit disconcerting, since prior to any measurements the 
state of the Qbits was (6.26), in which none of them was individually 
predisposed to reveal any particular value. Indeed, it would seem that 
the 3-Qbit state (6.26) gives an incomplete description of the Qbits. 
The omitted predisposition of Qbit 0 seems to be an additional element 
of reality that a more complete description than that afforded by the 
quantum theory would take into account. 
But if Qbit 0 did indeed have a predetermined predisposition to give
$x_0$ when measured, even before Qbits 1 and 2 were measured to reveal
what $x_0$ actually was, then the value of $x_0$ surely would not be altered if
Hadamards were applied to Qbits 1 and 2 before they were measured,
since the Qbits have ceased to interact, and the predisposition to give
$x_0$ was present before the decision to apply Hadamards or not had been
made. This means that the value $x_0$ appearing in (6.34) must indeed be
identical to the value of $x_0$ appearing in (6.35). So our question is not
meaningless. The answer is Yes!

Such an argument for elements of reality - predetermined values - 
was put forth in 1935 (in a different context) by Albert Einstein, Boris 
Podolsky, and Nathan Rosen (EPR). The controversy and discussion 
it has given rise to has steadily increased over the past seven decades. 
The terms “incomplete” and “element of reality” originated with EPR. 
Today it is Einstein’s most cited paper. 

The wonderful thing about three Qbits in the state (6.26) is that 
they not only provide a beautiful illustration of the EPR argument, 
but also, when examined further, reveal that the appealing argument 
establishing predetermined measurement outcomes cannot be correct. 
To see this, note that exactly the same reasoning establishes that the
values of $x_1$ appearing in (6.34) and (6.36) must be the same, as well as
the values of $x_2$ appearing in (6.34) and (6.37). And the same line of
thought establishes that the values of $x_0^H$ in (6.37) and (6.36) must be
the same, as well as the values of $x_1^H$ in (6.37) and (6.35) and the values
of $x_2^H$ in (6.36) and (6.35).

If all this is true, then adding together the left sides of (6.34)-(6.37)
must give 0 modulo 2, since each of $x_2$, $x_1$, $x_0$, $x_2^H$, $x_1^H$, and $x_0^H$ appears
in exactly two of the equations. But the modulo 2 sum of the right sides
is $0 \oplus 1 \oplus 1 \oplus 1 = 1$.
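The counting argument can also be checked by brute force. The short Python sketch below (an illustration only; the function name is hypothetical) enumerates all 64 ways of assigning predetermined values to the six quantities and looks for any assignment compatible with all four constraints.

```python
from itertools import product

def satisfies_all(x2, x1, x0, x2h, x1h, x0h):
    """True if one fixed set of predetermined values obeys (6.34)-(6.37)."""
    return ((x2 ^ x1 ^ x0) == 0 and      # (6.34): no Hadamards
            (x2h ^ x1h ^ x0) == 1 and    # (6.35): Hadamards on Qbits 2 and 1
            (x2h ^ x1 ^ x0h) == 1 and    # (6.36): Hadamards on Qbits 2 and 0
            (x2 ^ x1h ^ x0h) == 1)       # (6.37): Hadamards on Qbits 1 and 0

consistent = [bits for bits in product((0, 1), repeat=6) if satisfies_all(*bits)]
print(len(consistent))  # 0: no assignment satisfies all four constraints
```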

So the appealing EPR argument must be wrong. There are no elements
of reality - no predetermined measurement outcomes that a
more complete theory would take into account. The answer to what
is mistaken in the simple and persuasive reasoning that led Einstein,
Podolsky, and Rosen to the existence of elements of reality is still a matter
of debate more than 70 years later. How, after all, can Qbit 0 and its
measurement gate “know” that if they interact only after Qbits 1 and 2
have gone through their own measurement gates (and no Hadamards
were applied) then the result of the measurement of Qbit 0 must be
given by (6.38)?

The best explanation anybody has come up with to this day is to 
insist that no explanation is needed beyond what one can infer from 
the laws of quantum mechanics. Those laws are correct. Quantum 
mechanics works. There is no controversy about that. What fail to
work are attempts to provide underlying mechanisms, that go beyond
the quantum-mechanical rules, for how certain strong quantum
correlations can actually operate. One gets puzzled only if one tries to
understand how the rules can work not only for the actual situation
in which they are applied, but also in alternative situations that might
have been chosen but were not.

By concluding with this “paradoxical” state of affairs, I am not 
suggesting that there is anything wrong with the quantum-theoretic 
description of Qbits and the gates that act on them. On the contrary, the 
quantum theory has to be regarded as the most accurate and successful
theory in the history of physics, and there is no doubt whatever among 
physicists that if the formidable technological obstacles standing in 
the way of building a quantum computer can be overcome, then the 
computer will behave exactly as described in the preceding chapters. 

But I cannot, in good conscience, leave you without a warning that 
the simple theory of Qbits developed here, though correct, is in some 
respects exceedingly strange. The strangeness emerges only when one 
seeks to go beyond the straightforward rules enunciated in Chapter 1. 
In particular one must not ask for an underlying mechanism that accounts
not only for the behavior of the circuit actually applied to a
particular collection of Qbits, but also for the possible behavior of 
other circuits that might have been applied to the very same collection 
of Qbits, but were not. 

A good motto for the quantum physicist and for future quantum 
computer scientists might be “What didn’t happen didn’t happen.” On
that firm note I conclude (except for the 16 appendices that follow). 



Appendix A 

Vector spaces: basic properties 
and Dirac notation 


In quantum computation the integers from 0 to N are associated with 
N + 1 orthogonal unit vectors in a vector space of D = N + 1 dimensions
over the complex numbers. The nature of this association is the
subject of Chapter 1. Here we review some of the basic properties of 
such a vector space, while relating conventional vector-space notation 
to the Dirac notation used in quantum computer science. Usually the 
dimension D is a power of 2, but this does not matter for our summary 
of the basic facts and nomenclature. 

In conventional notation such a set of D = N + 1 orthonormal
vectors might be denoted by symbols such as $\phi_0, \phi_1, \phi_2, \ldots, \phi_N$. The
orthogonality and normalization conditions are expressed in terms of
the inner products $(\phi_x, \phi_y)$:

$$(\phi_x, \phi_y) = \begin{cases} 0, & x \neq y; \\ 1, & x = y. \end{cases} \tag{A.1}$$


In quantum computation the indices x and y describing the integers 
associated with the vectors play too important a role to be relegated 
to tiny fonts in subscripts. Fortunately quantum mechanics employs 
a notation for vectors, invented by the physicist Paul Dirac, which is 
well suited for representing such information more prominently. One 
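replaces the symbols $\phi_x$ and $\phi_y$ by $|x\rangle$ and $|y\rangle$, and represents the inner
product $(\phi_x, \phi_y)$ by the symbol $\langle x|y\rangle$. The orthonormality condition
(A.1) becomes

$$\langle x|y\rangle = \begin{cases} 0, & x \neq y; \\ 1, & x = y. \end{cases} \tag{A.2}$$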


Vectorial character is conveyed by the symbol $|\ \rangle$, with the specific
vector being identified by whatever it is that goes between the bent
line $\rangle$ and the vertical line $|$. This notational strategy is reminiscent of
the notation for vectors in ordinary three-dimensional physical space
(which we will use here for such vectors) in which vectorial character
is indicated by a horizontal arrow above a symbol denoting the specific
vector being referred to: $\vec{r}$.

Symbols like $\phi$ and $\psi$ remain useful in the notation of quantum computation
for representing generic vectors, but for consistency with the
notation for vectors associated with specific integers, and to emphasize
their vectorial character, they too are enclosed between a bent line $\rangle$
and a vertical line $|$, becoming $|\phi\rangle$ and $|\psi\rangle$. Some mathematicians
disapprove of this practice. Why write $|\phi\rangle$, introducing the spurious
symbols $\rangle$ and $|$, when $\phi$ by itself does the job perfectly well? This gets
it backwards. The real point is that the important information - for
example the number 7798 - is easier to read in the form $|7798\rangle$ than
when presented in small print in the form $\phi_{7798}$. Why introduce in a
normal font the often uninformative symbol $\phi$, at the price of demoting
the most important information to a mere subscript?

The vector space that describes the operation of a quantum computer
consists of all linear combinations $|\psi\rangle$ of the N + 1 orthonormal
vectors $|x\rangle$, $x = 0, \ldots, N$, with coefficients $\alpha_x$ taken from the complex
numbers:

$$|\psi\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle + \cdots + \alpha_N|N\rangle = \sum_{x=0}^{N} \alpha_x|x\rangle, \tag{A.3}$$

where $\alpha_x = u_x + i v_x$, $u_x$ and $v_x$ are real numbers, and $i = \sqrt{-1}$.

The mathematicians’ preference for writing $\psi$ instead of $|\psi\rangle$ for
generic vectors is explicitly acknowledged in the useful convention
that $|\alpha\psi + \beta\phi\rangle$ is nothing more than an alternative way of writing the
vector $\alpha|\psi\rangle + \beta|\phi\rangle$:

$$|\alpha\psi + \beta\phi\rangle = \alpha|\psi\rangle + \beta|\phi\rangle. \tag{A.4}$$

In a vector space over the complex numbers the inner product of 
two general vectors is a complex number satisfying 

$$\langle\phi|\psi\rangle = \langle\psi|\phi\rangle^*, \tag{A.5}$$

where $*$ denotes complex conjugation:

$$(u + iv)^* = u - iv, \quad u, v \ \text{real}. \tag{A.6}$$

The inner product is linear in the right-hand vector,

$$\langle\phi|\alpha\psi_1 + \beta\psi_2\rangle = \alpha\langle\phi|\psi_1\rangle + \beta\langle\phi|\psi_2\rangle, \tag{A.7}$$

and therefore, from (A.5), “anti-linear” in the left-hand vector,

$$\langle\alpha\phi_1 + \beta\phi_2|\psi\rangle = \alpha^*\langle\phi_1|\psi\rangle + \beta^*\langle\phi_2|\psi\rangle. \tag{A.8}$$

The inner product of a vector with itself is a real number satisfying

$$\langle\phi|\phi\rangle > 0, \quad |\phi\rangle \neq 0. \tag{A.9}$$

It follows from the orthonormality condition (A.2) that the inner
product of the vector $|\psi\rangle$ in (A.3) with another vector

$$|\phi\rangle = \beta_0|0\rangle + \beta_1|1\rangle + \cdots + \beta_N|N\rangle = \sum_x \beta_x|x\rangle \tag{A.10}$$

is given in terms of the expansion coefficients $\alpha_x$ and $\beta_x$ (called amplitudes
in quantum computation) by

$$\langle\phi|\psi\rangle = \sum_x \beta_x^*\alpha_x. \tag{A.11}$$

The squared magnitude of a vector is its inner product with itself, so
(A.11) gives for the squared magnitude

$$\langle\psi|\psi\rangle = \sum_x |\alpha_x|^2, \tag{A.12}$$

where

$$|u + iv|^2 = u^2 + v^2, \quad u, v \ \text{real}. \tag{A.13}$$

The form (A.12) gives an explicit confirmation of the rule (A.9).
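For readers who think in terms of arrays, here is a small Python sketch of (A.5), (A.11), and (A.12), assuming NumPy. The amplitudes are arbitrary illustrative values. Note that `np.vdot` conjugates its first argument, which matches the anti-linearity of the inner product in the left-hand vector.

```python
import numpy as np

alpha = np.array([1 + 2j, 0.5j, -1.0])   # amplitudes of |psi> (made-up values)
beta = np.array([2 - 1j, 3.0, 1j])       # amplitudes of |phi> (made-up values)

# (A.11): <phi|psi> = sum_x beta_x^* alpha_x
inner = np.vdot(beta, alpha)

# (A.5): exchanging the two vectors complex-conjugates the inner product
swapped = np.vdot(alpha, beta)

# (A.12): squared magnitude of |psi>
norm_sq = np.vdot(alpha, alpha).real

print(inner, swapped, norm_sq)
```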

A linear transformation A associates with every vector $|\psi\rangle$ another
vector, called $A|\psi\rangle$, subject to the rule (linearity)

$$A(\alpha|\psi\rangle + \beta|\phi\rangle) = \alpha A|\psi\rangle + \beta A|\phi\rangle. \tag{A.14}$$

With a nod to the mathematicians, it is notationally useful to define

$$|A\psi\rangle = A|\psi\rangle. \tag{A.15}$$

A linear transformation that preserves the magnitudes of all vectors
is called unitary, because it follows from linearity that all magnitudes
will be preserved if and only if unit vectors (vectors of magnitude
1) are taken into unit vectors. It also follows from linearity that if a
linear transformation U is unitary then it must preserve not only the
inner products of arbitrary vectors with themselves, but also the inner
products of arbitrary pairs of vectors. This follows straightforwardly
for two general vectors $|\phi\rangle$ and $|\psi\rangle$ from the fact that U preserves the
magnitudes of both of them, as well as the magnitudes of the vectors
$|\phi\rangle + |\psi\rangle$ and $|\phi\rangle + i|\psi\rangle$.

One can associate with any given vector $|\phi\rangle$ the linear functional
that takes every vector $|\psi\rangle$ into the number $\langle\phi|\psi\rangle$. Linearity follows
from property (A.7) of the inner product. The set of all such linear
functionals is itself a vector space, called the dual space of the original
space. The functional associated with the vector $\alpha|\phi\rangle + \beta|\psi\rangle$ is the
sum of $\alpha^*$ times the functional associated with $|\phi\rangle$ and $\beta^*$ times the
functional associated with $|\psi\rangle$. It is an easy exercise to show that any
linear functional on the original space is associated with some vector in
the dual space. Dirac called vectors in the original space ket vectors and
vectors in the dual space bra vectors. He denoted the bra associated with
the ket $|\phi\rangle$ by the symbol $\langle\phi|$, so that the symbol $\langle\phi|\psi\rangle$ can equally
well be viewed as the inner product of the two kets $|\phi\rangle$ and $|\psi\rangle$ or as a
compact way of expressing the action $\langle\phi|(|\psi\rangle)$ of the associated linear
functional $\langle\phi|$ on the vector $|\psi\rangle$. Note that one has

$$\langle\alpha\phi + \beta\psi| = \alpha^*\langle\phi| + \beta^*\langle\psi|. \tag{A.16}$$

A linear transformation A on the space of ket vectors induces a linear
transformation $A^\dagger$ (called “A-adjoint”) on the dual space of bra vectors,
according to the rule

$$\langle A\psi| = \langle\psi|A^\dagger. \tag{A.17}$$

The operation adjoint to the trivial linear transformation that multiplies
by a given complex number is multiplication by the complex conjugate
of that number.

It is convenient to extend the dagger notation to the vectors themselves,
defining

$$(|\psi\rangle)^\dagger = \langle\psi|, \tag{A.18}$$

so that the bra dual to a given ket is viewed as adjoint to that ket. The
definition (A.17) of $A^\dagger$ then becomes

$$(|A\psi\rangle)^\dagger = \langle\psi|A^\dagger, \tag{A.19}$$

or, with (A.15),

$$(A|\psi\rangle)^\dagger = \langle\psi|A^\dagger, \tag{A.20}$$

which provides a simple example of a very general rule that the adjoint
of a product of quantities is the product of their adjoints taken in the
opposite order. Another instance of the rule which follows from (A.20)
is that

$$\langle\phi|(AB)^\dagger = \langle AB\phi| = \langle B\phi|A^\dagger = \langle\phi|B^\dagger A^\dagger. \tag{A.21}$$

Since this holds for arbitrary $\langle\phi|$ we have

$$(AB)^\dagger = B^\dagger A^\dagger. \tag{A.22}$$

Although the adjoint $A^\dagger$ of a linear transformation A on kets is a
linear transformation on bras, one can also define its action on kets.
One does so by requiring that the action of $\langle\phi|$ on $A^\dagger|\psi\rangle$ should be
equal to the action of $\langle\phi|A^\dagger$ on $|\psi\rangle$. This amounts to stipulating that the
symbol $\langle\phi|A^\dagger|\psi\rangle$ should be unambiguous; it does not matter whether
it is read as $(\langle\phi|A^\dagger)|\psi\rangle$ or as $\langle\phi|(A^\dagger|\psi\rangle)$. Implicit in this definition is
the fact that a vector is completely defined by giving its inner product
with all vectors. This in turn follows from the fact that a vector $|\psi\rangle$
can be defined by giving all the amplitudes $\alpha_x$ in its expansion (A.3) in
the complete orthonormal set $|x\rangle$. But $\alpha_x = \langle x|\psi\rangle$. Similarly, a linear
operator A is completely defined by giving its matrix elements $\langle\phi|A|\psi\rangle$
for arbitrary pairs of vectors, since the subset $\langle x|A|y\rangle$ is already enough
to determine its action on a general vector (A.3).

Note that any matrix element of $A^\dagger$ is equal to the complex conjugate
of the transposed (with $\phi$ and $\psi$ exchanged) matrix element of A:

$$\langle\phi|A^\dagger|\psi\rangle = \langle A\phi|\psi\rangle = \langle\psi|A\phi\rangle^* = \langle\psi|A|\phi\rangle^*. \tag{A.23}$$

It follows from this that

$$(A^\dagger)^\dagger = A. \tag{A.24}$$

Since a unitary transformation U preserves inner products, we have

$$\langle\phi|\psi\rangle = \langle U\phi|U\psi\rangle = \langle\phi|U^\dagger U|\psi\rangle, \tag{A.25}$$

and therefore

$$U^\dagger U = \mathbf{1}, \tag{A.26}$$

where $\mathbf{1}$ is the unit (identity) operator that takes every vector into itself.
It follows from (A.26) that

$$U U^\dagger U = U. \tag{A.27}$$

In a finite-dimensional vector space a unitary transformation U always
takes an orthonormal basis into another orthonormal basis, so any U
clearly has a right inverse - the linear transformation that takes the
second basis back into the first. Multiplying (A.27) on the right by that
inverse tells us that

$$U U^\dagger = \mathbf{1}, \tag{A.28}$$

so $U^\dagger$ and U are inverses regardless of the order in which they act.
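These properties of unitary transformations can be illustrated numerically. The Python sketch below assumes NumPy and obtains a sample unitary matrix as the Q factor of a QR decomposition of a random complex matrix; that construction is just a convenient way to produce an example, not something taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# The Q factor of a QR decomposition of a square matrix is unitary.
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, _ = np.linalg.qr(M)
U_dagger = U.conj().T

phi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)

left_inverse = np.allclose(U_dagger @ U, np.eye(4))    # (A.26)
right_inverse = np.allclose(U @ U_dagger, np.eye(4))   # (A.28)
# (A.25): inner products of arbitrary pairs of vectors are preserved
preserved = np.isclose(np.vdot(U @ phi, U @ psi), np.vdot(phi, psi))

print(left_inverse, right_inverse, preserved)
```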

The vector $|\phi\rangle$ is an eigenvector of the linear operator A if the action
of A on $|\phi\rangle$ is simply to multiply it by a complex number a, called an
eigenvalue of A:

$$A|\phi\rangle = a|\phi\rangle. \tag{A.29}$$

Since the number a can be expressed as $a = \langle\phi|A|\phi\rangle/\langle\phi|\phi\rangle$, it follows
from (A.23) that if $A = A^\dagger$ (such operators are said to be self-adjoint or
Hermitian) then a is a real number. Eigenvalues of Hermitian operators
are necessarily real.

Since A is Hermitian and a is a real number, it follows from (A.29)
(by forming the adjoints of both sides) that

$$\langle\phi|A = a\langle\phi|, \tag{A.30}$$

so the vector dual to an eigenket of a Hermitian operator is an eigenbra
with the same eigenvalue. It follows immediately that if $|\phi'\rangle$ is another
eigenvector of A with eigenvalue $a'$, then

$$a\langle\phi|\phi'\rangle = \langle\phi|A|\phi'\rangle = a'\langle\phi|\phi'\rangle, \tag{A.31}$$

so if $a' \neq a$ then $\langle\phi|\phi'\rangle = 0$: eigenvectors of a Hermitian operator with
different eigenvalues are orthogonal.

It can be shown that for any Hermitian operator A, one can choose an 
orthonormal basis for the entire D-dimensional space whose members 
are eigenvectors of A. The basis is unique if and only if all the D 
eigenvalues of A are distinct. In the contrary case (in which A is said 
to be degenerate) one can pick arbitrary orthonormal bases within 
each of the subspaces spanned by eigenvectors of A with the same 
eigenvalue. More generally, if A, B, C, ... are mutually commuting
Hermitian operators then one can choose an orthonormal basis whose 
members are eigenstates of every one of them. 

If B is any linear operator, then $A_1 = B + B^\dagger$ and $A_2 = i(B^\dagger - B)$
are both Hermitian, and commute if B and $B^\dagger$ commute. Since a joint
eigenvector of $A_1$ and $A_2$ is also a joint eigenvector of $B = \tfrac{1}{2}(A_1 + iA_2)$
and $B^\dagger = \tfrac{1}{2}(A_1 - iA_2)$, it follows that if B commutes with $B^\dagger$ then one can
choose an orthonormal basis of eigenvectors of B. In particular, since
a unitary transformation U satisfies $UU^\dagger = U^\dagger U = \mathbf{1}$, one can choose
an orthonormal basis consisting of eigenvectors of U. Since unitary
transformations preserve the magnitudes of vectors, the eigenvalues
of U must be complex numbers of modulus 1. In the quantum theory
such complex numbers are often called phase factors.
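The statements about eigenvalues can be spot-checked on random examples. The sketch below, assuming NumPy, builds a Hermitian matrix as $B + B^\dagger$ and a unitary matrix from a QR decomposition. `np.linalg.eigh` is NumPy's eigensolver for Hermitian input; the random matrices are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

# Hermitian operator A_1 = B + B^dagger
A = B + B.conj().T
evals, evecs = np.linalg.eigh(A)   # columns of evecs are eigenvectors of A

# A general-purpose eigensolver finds the same eigenvalues, real up to rounding.
general_evals = np.linalg.eigvals(A)

# A sample unitary matrix: its eigenvalues should be phase factors.
U, _ = np.linalg.qr(B)
phases = np.linalg.eigvals(U)

print(np.max(np.abs(general_evals.imag)))   # close to zero
print(np.abs(phases))                       # all close to 1
```

Because the eigenvalues of a random Hermitian matrix are almost surely distinct, the orthonormality of the columns of `evecs` here is an instance of (A.31).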

Given two vector spaces of dimensions $D_1$ and $D_2$, and given any
two vectors $|\psi_1\rangle$ and $|\psi_2\rangle$ in the two spaces, one associates with each
such pair a tensor product $|\psi_1\rangle \otimes |\psi_2\rangle$ (often the tensor-product sign
is omitted) which is bilinear:

$$\begin{aligned}
|\psi_1\rangle \otimes (\alpha|\psi_2\rangle + \beta|\phi_2\rangle) &= \alpha|\psi_1\rangle \otimes |\psi_2\rangle + \beta|\psi_1\rangle \otimes |\phi_2\rangle, \\
(\alpha|\psi_1\rangle + \beta|\phi_1\rangle) \otimes |\psi_2\rangle &= \alpha|\psi_1\rangle \otimes |\psi_2\rangle + \beta|\phi_1\rangle \otimes |\psi_2\rangle.
\end{aligned} \tag{A.32}$$

With the further rule that $|\psi_1\rangle \otimes |\psi_2\rangle = |\phi_1\rangle \otimes |\phi_2\rangle$ only if $|\phi_1\rangle$ and
$|\phi_2\rangle$ are scalar multiples of $|\psi_1\rangle$ and $|\psi_2\rangle$, one easily sees that the set of
all tensor products of vectors from the two spaces forms a vector space
of dimension $D_1 D_2$.

One defines the inner product of $|\psi_1\rangle \otimes |\psi_2\rangle$ with $|\phi_1\rangle \otimes |\phi_2\rangle$ to be
the ordinary product $\langle\psi_1|\phi_1\rangle\langle\psi_2|\phi_2\rangle$ of the inner products in the two
original spaces. Given orthonormal bases for each of the two spaces,
the set of tensor products of all pairs of vectors from the two bases
forms an orthonormal basis for the tensor-product space. If $A_1$ and $A_2$
are linear operators on the two spaces, one defines the tensor-product
operator $A_1 \otimes A_2$ to satisfy

$$(A_1 \otimes A_2)(|\psi_1\rangle \otimes |\psi_2\rangle) = |A_1\psi_1\rangle \otimes |A_2\psi_2\rangle = (A_1|\psi_1\rangle) \otimes (A_2|\psi_2\rangle), \tag{A.33}$$

and easily shows that it can be extended to a linear operator on the
entire tensor-product space.

All of this generalizes in the obvious way to n-fold tensor products
of n vector spaces.
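In component form the tensor product is the Kronecker product, which NumPy provides as `np.kron`. The sketch below (random illustrative vectors and operators, dimensions 2 and 3 chosen arbitrarily) checks the inner-product rule and the defining property (A.33).

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_vec(d):
    """Random complex vector of dimension d (illustrative data only)."""
    return rng.normal(size=d) + 1j * rng.normal(size=d)

psi1, phi1 = rand_vec(2), rand_vec(2)   # vectors in the first space, D1 = 2
psi2, phi2 = rand_vec(3), rand_vec(3)   # vectors in the second space, D2 = 3
A1 = rng.normal(size=(2, 2))
A2 = rng.normal(size=(3, 3))

# Inner product of tensor products = product of the inner products.
lhs_inner = np.vdot(np.kron(psi1, psi2), np.kron(phi1, phi2))
rhs_inner = np.vdot(psi1, phi1) * np.vdot(psi2, phi2)

# (A.33): (A1 x A2)(|psi1> x |psi2>) = (A1|psi1>) x (A2|psi2>)
lhs_op = np.kron(A1, A2) @ np.kron(psi1, psi2)
rhs_op = np.kron(A1 @ psi1, A2 @ psi2)

print(np.kron(psi1, psi2).shape)   # (6,): dimension D1 * D2
print(np.isclose(lhs_inner, rhs_inner), np.allclose(lhs_op, rhs_op))
```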

If A is a linear operator whose eigenvectors constitute an orthonormal
basis - i.e. if A is Hermitian or, more generally, if A and $A^\dagger$ commute
- and if f is a function taking complex numbers to complex
numbers, then one can define f(A) by specifying that each eigenvector
$|\phi\rangle$ of A in the basis, with eigenvalue a, is also an eigenvector of f(A)
with eigenvalue f(a). This defines f(A) on a basis, and it can therefore
be extended to arbitrary vectors by requiring it to be linear. It follows
from this definition that if f(z) is a polynomial or convergent power
series in z then f(A) is the corresponding polynomial or convergent
power series in A.
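The definition of f(A) translates directly into a few lines of code. The sketch below, assuming NumPy and restricting to Hermitian A so that `np.linalg.eigh` applies, applies f to the eigenvalues and compares the result with a polynomial and with a truncated exponential series. The function name `apply_function` and the sample matrix are illustrative.

```python
import numpy as np

def apply_function(A, f):
    """f(A) for Hermitian A: keep the eigenvectors, replace each eigenvalue a by f(a)."""
    evals, V = np.linalg.eigh(A)   # columns of V are orthonormal eigenvectors
    return V @ np.diag(f(evals)) @ V.conj().T

A = np.array([[1.0, 1j], [-1j, 2.0]])   # a sample Hermitian matrix

# Polynomial f(z) = z^2 + 3z should give A^2 + 3A.
poly = apply_function(A, lambda a: a**2 + 3 * a)

# f(z) = exp(z) should agree with the power series sum_k A^k / k!.
exp_A = apply_function(A, np.exp)
series = np.zeros((2, 2), dtype=complex)
term = np.eye(2, dtype=complex)
for k in range(40):
    series = series + term
    term = term @ A / (k + 1)

print(np.allclose(poly, A @ A + 3 * A), np.allclose(exp_A, series))
```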

In Dirac notation one defines the outer product of two vectors $|\phi\rangle$
and $|\psi\rangle$ to be the linear operator, denoted by $|\phi\rangle\langle\psi|$, that takes any
vector $|\gamma\rangle$ into $|\phi\rangle$ multiplied by the inner product $\langle\psi|\gamma\rangle$:

$$(|\phi\rangle\langle\psi|)|\gamma\rangle = |\phi\rangle(\langle\psi|\gamma\rangle). \tag{A.34}$$

As is always the case with Dirac notation, the point is to define things
in such a way that the evaluation of an ambiguous expression such as
$|\phi\rangle\langle\psi|\gamma\rangle$ does not depend on how you read it; the notation is designed
always to enforce the associative law.

Note that $|\phi\rangle\langle\phi|$ is the projection operator onto the one-dimensional
subspace spanned by the unit vector $|\phi\rangle$; i.e. any vector
$|\gamma\rangle$ can be written as the sum of a vector $|\gamma\rangle_\parallel$ in the one-dimensional
subspace and a vector $|\gamma\rangle_\perp$ perpendicular to the one-dimensional subspace,
and

$$(|\phi\rangle\langle\phi|)|\gamma\rangle = |\gamma\rangle_\parallel. \tag{A.35}$$

Similarly, if one has a set of orthonormal vectors $|\phi_i\rangle$ then
$\sum_i |\phi_i\rangle\langle\phi_i|$ projects onto the subspace spanned by all the $|\phi_i\rangle$. If
the orthonormal set is a complete orthonormal set - for example
$|x\rangle$, $x = 0, \ldots, N$ - then the set spans the entire vector space and
the projection operator is the unit operator $\mathbf{1}$:

$$\sum_{x=0}^{N} |x\rangle\langle x| = \mathbf{1}. \tag{A.36}$$

This trivial identity can be surprisingly helpful. Any vector $|\psi\rangle$, for
example, satisfies

$$|\psi\rangle = \mathbf{1}|\psi\rangle = \sum_x |x\rangle\langle x|\psi\rangle, \tag{A.37}$$

which tells us that the amplitudes $\alpha_x$ appearing in the expansion (A.3)
of $|\psi\rangle$ are just the inner products $\langle x|\psi\rangle$. Similarly, any linear operator
A satisfies

$$A = \mathbf{1}A\mathbf{1} = \sum_{x,y} |x\rangle\langle x|A|y\rangle\langle y| = \sum_{x,y} |x\rangle\langle y|\,\bigl(\langle x|A|y\rangle\bigr), \tag{A.38}$$

which reveals the matrix elements $\langle x|A|y\rangle$ to be the expansion coefficients
of the operator A in the “operator basis” $|x\rangle\langle y|$. And note
that

$$\langle x|AB|y\rangle = \langle x|A\mathbf{1}B|y\rangle = \sum_z \langle x|A|z\rangle\langle z|B|y\rangle, \tag{A.39}$$

which gives the familiar matrix-multiplication rule for constructing
the matrix of a product out of the matrix elements of the individual
operators.

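If you prefer to think of vectors in terms of their components in a
specific basis, then you might note that the (ket) vector $|\psi\rangle$ with the
expansion (A.3) with amplitudes $\alpha_x$ in the orthonormal basis $|x\rangle$, can
be represented by a column vector:

$$|\psi\rangle \longrightarrow \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_N \end{pmatrix}. \tag{A.40}$$

The associated bra vector is then the row vector:

$$\langle\psi| \longrightarrow \begin{pmatrix} \alpha_0^* & \alpha_1^* & \cdots & \alpha_N^* \end{pmatrix}. \tag{A.41}$$

If

$$|\phi\rangle \longrightarrow \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_N \end{pmatrix} \tag{A.42}$$

then the inner product $\langle\phi|\psi\rangle$ is given by the ordinary matrix product
of the row and column vectors:

$$\langle\phi|\psi\rangle = \begin{pmatrix} \beta_0^* & \beta_1^* & \cdots & \beta_N^* \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_N \end{pmatrix}. \tag{A.43}$$

The outer product $|\psi\rangle\langle\phi|$ is also a matrix product:

$$|\psi\rangle\langle\phi| = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_N \end{pmatrix} \begin{pmatrix} \beta_0^* & \beta_1^* & \cdots & \beta_N^* \end{pmatrix} = \begin{pmatrix} \alpha_0\beta_0^* & \alpha_0\beta_1^* & \cdots & \alpha_0\beta_N^* \\ \alpha_1\beta_0^* & \alpha_1\beta_1^* & \cdots & \alpha_1\beta_N^* \\ \vdots & \vdots & & \vdots \\ \alpha_N\beta_0^* & \alpha_N\beta_1^* & \cdots & \alpha_N\beta_N^* \end{pmatrix}. \tag{A.44}$$

Note that in Dirac notation (A.43) is nothing more than the statement
that

$$\langle\phi|\psi\rangle = \sum_x \langle\phi|x\rangle\langle x|\psi\rangle = \sum_x \langle x|\phi\rangle^*\langle x|\psi\rangle, \tag{A.45}$$

while (A.44) asserts that

$$\langle x|(|\psi\rangle\langle\phi|)|y\rangle = \langle x|\psi\rangle\langle\phi|y\rangle = \langle x|\psi\rangle\langle y|\phi\rangle^*. \tag{A.46}$$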
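The outer-product and completeness relations are easy to experiment with in component form. The Python sketch below assumes NumPy; the helper name `outer` and the sample vectors are illustrative. It checks (A.35), (A.36), and (A.38) in a four-dimensional space.

```python
import numpy as np

D = 4
basis = np.eye(D, dtype=complex)   # basis[x] holds the components of |x>

def outer(psi, phi):
    """|psi><phi| as a matrix with entries psi_x * conj(phi_y), as in (A.44)."""
    return np.outer(psi, phi.conj())

# (A.36): the projectors onto a complete orthonormal set sum to the unit operator.
identity = sum(outer(basis[x], basis[x]) for x in range(D))

# (A.35): |phi><phi| projects onto the direction of the unit vector |phi>.
phi = np.array([1, 1j, 0, -1], dtype=complex) / np.sqrt(3)
gamma = np.array([2, -1j, 3, 0.5], dtype=complex)
P = outer(phi, phi)
parallel = P @ gamma
perpendicular = gamma - parallel

# (A.38): rebuild an operator from its matrix elements <x|A|y>.
A = np.arange(16).reshape(4, 4) + 1j * np.eye(4)
elements = np.array([[np.vdot(basis[x], A @ basis[y]) for y in range(D)]
                     for x in range(D)])
rebuilt = sum(elements[x, y] * outer(basis[x], basis[y])
              for x in range(D) for y in range(D))

print(np.allclose(identity, np.eye(D)), np.allclose(rebuilt, A))
print(abs(np.vdot(phi, perpendicular)))   # close to zero
```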




Appendix B 

Structure of the general 1-Qbit 
unitary transformation 


I describe here some relations among Pauli matrices, 1-Qbit unitary 
transformations, and rotations of real-space three-dimensional vectors. 
The relations are of fundamental importance in many applications of 
quantum mechanics, and are an essential part of the intellectual equipment
of anybody wanting to understand the mathematical structure of
three-dimensional rotations. The reason for mentioning them here is
that they can also make certain circuit identities quite transparent. The
quantum-computation literature contains some unnecessarily cumbersome
derivations of many such identities, suggesting that these useful
mathematical facts deserve to be more widely known in the field. 

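The two-dimensional unit matrix $\mathbf{1}$ and the three Pauli matrices
form a basis,

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{B.1}$$

for the four-dimensional algebra of two-dimensional matrices: any two-dimensional
matrix u has a unique expansion of the form

$$u = u_0\mathbf{1} + \vec{u}\cdot\vec{\sigma} \tag{B.2}$$

for some complex number $u_0$ and 3-vector $\vec{u}$ with complex components
$u_x$, $u_y$, and $u_z$. Here $\vec{\sigma}$ represents the “3-vector” whose components
are the Pauli matrices $\sigma_x$, $\sigma_y$, and $\sigma_z$, so in expanded form
(B.2) reads

$$u = u_0\mathbf{1} + u_x\sigma_x + u_y\sigma_y + u_z\sigma_z = \begin{pmatrix} u_0 + u_z & u_x - iu_y \\ u_x + iu_y & u_0 - u_z \end{pmatrix}. \tag{B.3}$$

As what follows demonstrates, however, it is invariably simpler to use
the form (B.2) together with the multiplication rule (see Section 1.4)

$$(\vec{a}\cdot\vec{\sigma})(\vec{b}\cdot\vec{\sigma}) = (\vec{a}\cdot\vec{b})\mathbf{1} + i(\vec{a}\times\vec{b})\cdot\vec{\sigma} \tag{B.4}$$

rather than dealing explicitly with two-dimensional matrices.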
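Both the expansion (B.2) and the multiplication rule (B.4) can be verified numerically. In the Python sketch below (assuming NumPy) the expansion coefficients are extracted with traces, using the standard fact that $\mathrm{Tr}\,\sigma_i\sigma_j = 2\delta_{ij}$ and that each Pauli matrix is traceless. The helper names `dot_sigma` and `expand` and the sample numbers are illustrative.

```python
import numpy as np

one = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def dot_sigma(a):
    """The 2x2 matrix a . sigma for a 3-vector a."""
    return a[0] * sx + a[1] * sy + a[2] * sz

def expand(u):
    """Coefficients (u0, vec u) of u = u0 1 + vec u . sigma, as in (B.2)."""
    u0 = np.trace(u) / 2
    vec = np.array([np.trace(u @ s) / 2 for s in (sx, sy, sz)])
    return u0, vec

# Any 2x2 matrix is recovered from its four expansion coefficients.
m = np.array([[1 + 2j, 3], [4j, -5]])
u0, uvec = expand(m)
rebuilt = u0 * one + dot_sigma(uvec)

# Multiplication rule (B.4) for two sample real vectors.
a = np.array([1.0, -2.0, 0.5])
b = np.array([0.3, 4.0, -1.0])
lhs = dot_sigma(a) @ dot_sigma(b)
rhs = np.dot(a, b) * one + 1j * dot_sigma(np.cross(a, b))

print(np.allclose(rebuilt, m), np.allclose(lhs, rhs))
```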

Impose on (B.2) the condition

$$uu^\dagger = u^\dagger u = \mathbf{1} \tag{B.5}$$

that u be unitary. Since any unitary matrix remains unitary if it is
multiplied by an overall multiplicative phase factor $e^{i\theta}$ with $\theta$ real, we
can require $u_0$ to be real and arrive at a form which is general except for
such an overall phase factor. Since the Pauli matrices are Hermitian,
we then have

$$u^\dagger = u_0\mathbf{1} + \vec{u}^*\cdot\vec{\sigma}. \tag{B.6}$$

The rule (B.4) now tells us that for u to be unitary we must have

$$0 = \mathbf{1} - u^\dagger u = (1 - u_0^2 - \vec{u}^*\cdot\vec{u})\mathbf{1} - \bigl(u_0(\vec{u} + \vec{u}^*) + i\,\vec{u}^*\times\vec{u}\bigr)\cdot\vec{\sigma}. \tag{B.7}$$

Since $\mathbf{1}$, $\sigma_x$, $\sigma_y$, and $\sigma_z$ are linearly independent in the four-dimensional
algebra of 1-Qbit operators, the coefficients of all four
of them in (B.7) must vanish and we have

$$1 = u_0^2 + \vec{u}^*\cdot\vec{u}, \qquad 0 = u_0(\vec{u} + \vec{u}^*) + i\,\vec{u}^*\times\vec{u}. \tag{B.8}$$

The second of these requires the real and imaginary parts of the
vector $\vec{u}$ to satisfy

$$u_0\,\mathrm{Re}\,\vec{u} = \mathrm{Re}\,\vec{u}\times\mathrm{Im}\,\vec{u}. \tag{B.9}$$

If $u_0 \neq 0$, it follows from (B.9) that $\mathrm{Re}\,\vec{u}\cdot\mathrm{Re}\,\vec{u} = 0$, so $\mathrm{Re}\,\vec{u} = 0$,
and the vector $\vec{u}$ must be i times a real vector $\vec{v}$. On the other hand
if $u_0 = 0$ then (B.9) requires the real and imaginary parts of $\vec{u}$ to
be parallel vectors, so that $\vec{u}$ itself is just a complex multiple of a
real vector. But if $u_0 = 0$ we retain the freedom to pick the overall
phase of the operator u, which we can choose to make the vector $\vec{u}$
purely imaginary. So irrespective of whether or not $u_0 = 0$, the general
form for a two-dimensional unitary u is, to within an overall phase
factor,

$$u = u_0\mathbf{1} + i\,\vec{v}\cdot\vec{\sigma}, \tag{B.10}$$

where $u_0$ is a real number, $\vec{v}$ is a real vector, and, from the first of
(B.8),

$$u_0^2 + \vec{v}\cdot\vec{v} = 1. \tag{B.11}$$

The identity (B.11) allows us to parametrize $u_0$ and $\vec{v}$ in terms of
a real unit vector $\vec{n}$ parallel to $\vec{v}$ and a real angle $\gamma$ so that

$$u = \cos\gamma\,\mathbf{1} + i\sin\gamma\,(\vec{n}\cdot\vec{\sigma}). \tag{B.12}$$

An alternative way of writing (B.12) is

$$u = \exp(i\gamma\,\vec{n}\cdot\vec{\sigma}). \tag{B.13}$$

This follows from the forms of the power-series expansions of the
exponential, sine, and cosine, together with the fact that $(\vec{n}\cdot\vec{\sigma})^2 = \mathbf{1}$
for any unit vector $\vec{n}$ as a special case of (B.4). (The argument is the
same as the argument that $e^{i\varphi} = \cos\varphi + i\sin\varphi$ for any real number
$\varphi$.)
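The equivalence of (B.12) and (B.13) can be confirmed by summing the exponential series directly. The Python sketch below assumes NumPy; the truncation at 40 terms, the helper name `exp_series`, and the sample axis and angle are illustrative choices.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

n = np.array([2.0, -1.0, 2.0]) / 3.0          # a real unit vector
n_sigma = n[0] * sx + n[1] * sy + n[2] * sz
gamma = 0.7

def exp_series(M, terms=40):
    """Truncated power series sum_k M^k / k! (adequate for matrices of modest norm)."""
    result = np.zeros_like(M)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(terms):
        result = result + term
        term = term @ M / (k + 1)
    return result

u_series = exp_series(1j * gamma * n_sigma)                           # (B.13)
u_closed = np.cos(gamma) * np.eye(2) + 1j * np.sin(gamma) * n_sigma   # (B.12)

print(np.allclose(n_sigma @ n_sigma, np.eye(2)))   # (n . sigma)^2 = 1
print(np.allclose(u_series, u_closed))
```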

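A remarkable connection between these two-dimensional unitary
matrices and ordinary three-dimensional rotations emerges from the
fact that each of the three Pauli matrices in (B.1) has zero trace, and
that the operator unitary transformation

$$A \longrightarrow uAu^\dagger \tag{B.14}$$

preserves the trace of A.¹

Note first that if $\vec{a}$ is a real vector then $u(\vec{a}\cdot\vec{\sigma})u^\dagger$ is Hermitian
and can therefore be expressed as a linear combination of $\mathbf{1}$ and the
three Pauli matrices with real coefficients. Since $\sigma_x$, $\sigma_y$, and $\sigma_z$ all
have zero trace, so does $\vec{a}\cdot\vec{\sigma}$ and therefore so does $u(\vec{a}\cdot\vec{\sigma})u^\dagger$.
Its expansion as a linear combination of $\mathbf{1}$ and the three Pauli matrices
must therefore be of the form $\vec{a}'\cdot\vec{\sigma}$ for some real vector $\vec{a}'$ (since
$\mathbf{1}$ alone among the four matrices has nonzero trace):

$$u(\vec{a}\cdot\vec{\sigma})u^\dagger = \vec{a}'\cdot\vec{\sigma}. \tag{B.15}$$

It follows that

$$u(\vec{a}\cdot\vec{\sigma})(\vec{b}\cdot\vec{\sigma})u^\dagger = \bigl(u(\vec{a}\cdot\vec{\sigma})u^\dagger\bigr)\bigl(u(\vec{b}\cdot\vec{\sigma})u^\dagger\bigr) = (\vec{a}'\cdot\vec{\sigma})(\vec{b}'\cdot\vec{\sigma}). \tag{B.16}$$

Since unitary transformations preserve the trace,

$$\mathrm{Tr}\,(\vec{a}\cdot\vec{\sigma})(\vec{b}\cdot\vec{\sigma}) = \mathrm{Tr}\,(\vec{a}'\cdot\vec{\sigma})(\vec{b}'\cdot\vec{\sigma}). \tag{B.17}$$

Hence, from (B.4),

$$\vec{a}'\cdot\vec{b}' = \vec{a}\cdot\vec{b}. \tag{B.18}$$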

It follows directly from the form (B.15) of the transformation from
unprimed to primed vectors that $(\vec{a} + \vec{b})' = \vec{a}' + \vec{b}'$ and $(\lambda\vec{a})' = \lambda\vec{a}'$
- i.e. the transformation $\vec{a} \to \vec{a}'$ is linear. But the most general
real, linear, inner-product-preserving transformation on real 3-vectors
is a rotation. Consequently the transformation from real 3-vectors $\vec{a}$
to real 3-vectors $\vec{a}'$ induced by any two-dimensional unitary u through
(B.15) is a rotation:

$$\vec{a}' = R_u\vec{a}. \tag{B.19}$$

¹ The trace of a matrix is the sum of its diagonal elements. Recall also the
(easily verified) fact that the trace of a product of two matrices is
independent of the order in which the matrices are multiplied, even when
the matrices do not commute.

Furthermore, by applying the (unitary) product uv of two unitary
transformations in two steps,

$$(uv)(\vec{a}\cdot\vec{\sigma})(uv)^\dagger = u\bigl(v(\vec{a}\cdot\vec{\sigma})v^\dagger\bigr)u^\dagger = u\bigl([R_v\vec{a}]\cdot\vec{\sigma}\bigr)u^\dagger = [R_uR_v\vec{a}]\cdot\vec{\sigma}, \tag{B.20}$$

we deduce that

$$R_{uv} = R_uR_v. \tag{B.21}$$

Thus the association of three-dimensional rotations with two- 
dimensional unitary matrices preserves the multiplicative structure 
of the rotation group: the rotation associated with the product of two 
unitary transformations is the product of the two associated rotations. 

Which rotation is associated with which unitary transformation? To 
answer this, note first that when the vector $\vec{a}$ in (B.15) is taken to be 
the vector $\vec{n}$ appearing in $\mathbf{u}$ (in (B.12) or (B.13)) then $\vec{a}' = \vec{a}$, since 
$\mathbf{u}$ then commutes with $\vec{a}\cdot\vec{\sigma}$. Therefore $\vec{n}$ is along the axis of the 
rotation associated with $\mathbf{u} = \exp(i\gamma\,\vec{n}\cdot\vec{\sigma})$. To determine the angle 
$\theta$ of that rotation, let $\vec{m}$ be any unit vector perpendicular to the axis 
$\vec{n}$, so that 

$$\cos\theta = \vec{m}\cdot\vec{m}'. \tag{B.22}$$

We then have 

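$$\begin{aligned}
\cos\theta &= \tfrac12\,\mathrm{Tr}\,\big((\vec{m}\cdot\vec{\sigma})(\vec{m}'\cdot\vec{\sigma})\big) \\
&= \tfrac12\,\mathrm{Tr}\,\big((\vec{m}\cdot\vec{\sigma})(\cos\gamma\,\mathbf{1} + i\sin\gamma\,\vec{n}\cdot\vec{\sigma})(\vec{m}\cdot\vec{\sigma})(\cos\gamma\,\mathbf{1} - i\sin\gamma\,\vec{n}\cdot\vec{\sigma})\big) \\
&= \tfrac12\,\mathrm{Tr}\,\big(\big((\cos\gamma\,\vec{m} - \sin\gamma\,\vec{m}\times\vec{n})\cdot\vec{\sigma}\big)\big((\cos\gamma\,\vec{m} + \sin\gamma\,\vec{m}\times\vec{n})\cdot\vec{\sigma}\big)\big) \\
&= \cos^2\gamma - \sin^2\gamma = \cos(2\gamma),
\end{aligned} \tag{B.23}$$

where we have made repeated use of (B.4) and the fact that $\vec{m}\cdot\vec{n} = 0$. 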

So the unitary matrix (B.13) is associated with a rotation about the 
axis $\vec{n}$ through the angle $2\gamma$. Since the identity rotation is associated 
both with $\mathbf{u} = \mathbf{1}$ and with $\mathbf{u} = -\mathbf{1}$, the correspondence between 
these unitary matrices and three-dimensional proper rotations is 2-to-1. 
It is useful to introduce the notation $\mathbf{u}(\vec{n},\theta)$ for the 1-Qbit unitary 
transformation associated with the rotation $\mathbf{R}(\vec{n},\theta)$ about the axis $\vec{n}$ 
through the angle $\theta$: 

$$\mathbf{u}(\vec{n},\theta) = \exp\big(i\tfrac12\theta\,\vec{n}\cdot\vec{\sigma}\big) = \cos(\tfrac12\theta)\,\mathbf{1} + i(\vec{n}\cdot\vec{\sigma})\sin(\tfrac12\theta). \tag{B.24}$$

The three-dimensional rotations arrived at in this way are all proper 
(i.e. they preserve rather than invert handedness) because they can all 
be continuously connected to the identity. Any proper rotation can be 
associated with a u, and in just two different ways (u and — u clearly 
being associated with the same rotation). The choice of phase leading 
to the general form (B.10) with real $u_0$ can be imposed by requiring 
that the determinant of $\mathbf{u}$ must be 1, so in mathematical language we 
have a 2-to-1 homomorphism from the group SU(2) of unimodular 
unitary two-dimensional matrices to the group SO(3) of proper three- 
dimensional rotations. 
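
The correspondence can be checked numerically. The following Python sketch (standard library only) builds $\mathbf{u}(\vec{n},\theta)$ from (B.24) and extracts the vector $\vec{a}'$ defined by (B.15) through $a'_k = \tfrac12\mathrm{Tr}\,(\sigma_k\,\mathbf{u}(\vec{a}\cdot\vec{\sigma})\mathbf{u}^\dagger)$. With it one can confirm that the axis is left fixed, that a perpendicular unit vector is turned through $\theta$, that $\mathbf{u}$ and $-\mathbf{u}$ induce the same map, and that products go over into products as in (B.21). The helper names (`u_of`, `rotate`, and so on) are illustrative choices, not notation from the text.

```python
import math

# Pauli matrices as nested lists
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULI = [SX, SY, SZ]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def dot_sigma(v):
    """The matrix v . sigma for a real 3-vector v."""
    return [[sum(v[k] * PAULI[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def u_of(axis, theta):
    """u(n, theta) = cos(theta/2) 1 + i sin(theta/2) n.sigma, as in (B.24)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    ns = dot_sigma(axis)
    return [[c * (i == j) + 1j * s * ns[i][j] for j in range(2)]
            for i in range(2)]

def rotate(u, a):
    """Vector a' with u (a.sigma) u^dagger = a'.sigma, using a'_k = Tr(sigma_k M)/2."""
    m = matmul(matmul(u, dot_sigma(a)), dagger(u))
    result = []
    for sigma in PAULI:
        p = matmul(sigma, m)
        result.append(((p[0][0] + p[1][1]) / 2).real)
    return result

def dot(a, b):
    """Ordinary inner product of two real 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

# Example: axis along z, rotation angle 0.7 radians
n = [0.0, 0.0, 1.0]
m = [1.0, 0.0, 0.0]
u = u_of(n, 0.7)
print(rotate(u, n))                          # the axis itself, up to rounding
print(dot(m, rotate(u, m)), math.cos(0.7))   # the two numbers should agree
```

Only the cosine of the angle is compared here, which is all that (B.22) and (B.23) determine; the sketch makes no claim about the sense of the rotation.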

Although this may all seem tediously abstract, it is surprisingly 
useful at a very practical level. It can reduce some highly nontrivial 
three-dimensional geometry to elementary algebra, just as Euler’s 
relation $e^{i\phi} = \cos\phi + i\sin\phi$ reduces some slightly nontrivial two- 
dimensional trigonometry to simple algebra. Suppose, for example, 
that you combine a rotation through an angle $\alpha$ about an axis given 
by the unit vector $\vec{a}$ with a rotation through $\beta$ about $\vec{b}$. The result, 
of course, is a single rotation. What are its angle $\gamma$ and axis $\vec{c}$? 
Answering this question can be a nasty exercise in three-dimensional 
geometry. But to answer it using the Pauli matrices you need only note 
that $\mathbf{u}(\vec{c},\gamma) = \mathbf{u}(\vec{a},\alpha)\mathbf{u}(\vec{b},\beta)$, i.e. 

$$\cos(\tfrac12\gamma)\mathbf{1} + i\sin(\tfrac12\gamma)(\vec{c}\cdot\vec{\sigma}) = \big(\cos(\tfrac12\alpha)\mathbf{1} + i\sin(\tfrac12\alpha)(\vec{a}\cdot\vec{\sigma})\big)\big(\cos(\tfrac12\beta)\mathbf{1} + i\sin(\tfrac12\beta)(\vec{b}\cdot\vec{\sigma})\big). \tag{B.25}$$

Now multiply out the right side of (B.25), using (B.4). To get the angle 
$\gamma$ take the trace of both sides (or identify the coefficients of 1) to find 

$$\cos(\tfrac12\gamma) = \cos(\tfrac12\alpha)\cos(\tfrac12\beta) - (\vec{a}\cdot\vec{b})\sin(\tfrac12\alpha)\sin(\tfrac12\beta). \tag{B.26}$$

To get the axis $\vec{c}$, identify the vectors of coefficients of the Pauli 
matrices: 

$$\sin(\tfrac12\gamma)\,\vec{c} = \sin(\tfrac12\beta)\cos(\tfrac12\alpha)\,\vec{b} + \sin(\tfrac12\alpha)\cos(\tfrac12\beta)\,\vec{a} - \sin(\tfrac12\alpha)\sin(\tfrac12\beta)(\vec{a}\times\vec{b}). \tag{B.27}$$

Note that (B.26) and (B.27) are trivially correct when $\vec{a}$ and $\vec{b}$ are 
parallel. A little geometrical thought reveals that they are also correct 
when $\alpha$ and $\beta$ are both 180°. To try to see geometrically why they 
are correct more generally is to acquire a deep appreciation for the 
remarkable power of the representation of three-dimensional rotations 
in terms of two-dimensional unitary transformations. Other examples 
of the power of the representation are illustrated in the derivations of 
circuit identities in Section 2.6, in the characterization of the general 
1-Qbit state in Appendix C, and in the construction of the Hardy state 
in Appendix D. 
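
As a small illustration of how little work the composition rule requires, here is a Python sketch that evaluates (B.26) and (B.27) directly. The function name `compose` and the convention of returning `None` for the axis when the product is $\pm\mathbf{1}$ are choices made for this example only. The half-angle is taken between 0 and 180°, so that $\sin(\tfrac12\gamma)$ is non-negative and can be read off as the length of the vector on the right side of (B.27).

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def compose(a, alpha, b, beta):
    """Angle gamma and unit axis c with u(c, gamma) = u(a, alpha) u(b, beta).

    a and b must be unit vectors. Returns (gamma, c); c is None when the
    product is +1 or -1 (the identity rotation), whose axis is undefined.
    """
    ca, sa = math.cos(alpha / 2), math.sin(alpha / 2)
    cb, sb = math.cos(beta / 2), math.sin(beta / 2)
    a_dot_b = sum(x * y for x, y in zip(a, b))
    cos_half = ca * cb - a_dot_b * sa * sb                       # (B.26)
    axb = cross(a, b)
    vec = [sb * ca * b[k] + sa * cb * a[k] - sa * sb * axb[k]    # (B.27)
           for k in range(3)]
    sin_half = math.sqrt(sum(x * x for x in vec))
    gamma = 2 * math.atan2(sin_half, cos_half)   # gamma between 0 and 2*pi
    if sin_half < 1e-12:
        return gamma, None
    return gamma, [x / sin_half for x in vec]

# Parallel axes: the angles simply add.
print(compose([0, 0, 1], 0.3, [0, 0, 1], 0.5))
# Two 180-degree rotations about perpendicular axes.
print(compose([1, 0, 0], math.pi, [0, 1, 0], math.pi))
```

For the second call the result is a 180° rotation about the axis perpendicular to both, in agreement with the geometrical check suggested above.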



Appendix C 

Structure of the general 1-Qbit state 


The 1-Qbit computational-basis states $|0\rangle$ and $|1\rangle$ can be characterized 
to within an overall phase by the fact that they are eigenstates of the 
number operator $\mathbf{n}$ with eigenvalues 0 and 1 or, equivalently, that they 
are eigenstates of $\mathbf{1} - 2\mathbf{n} = \mathbf{Z} = \hat{z}\cdot\vec{\sigma}$ with eigenvalues 1 and −1. 

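Let $|\phi\rangle$ be any 1-Qbit state, and let $|\psi\rangle$ be the orthogonal state 
(unique to within an overall phase), satisfying $\langle\phi|\psi\rangle = 0$. Since $|0\rangle$ 
and $|1\rangle$ are linearly independent there is a unique linear transformation 
taking them into $|\phi\rangle$ and $|\psi\rangle$. But since $|\phi\rangle$ and $|\psi\rangle$ are an orthonormal 
pair (as are $|0\rangle$ and $|1\rangle$) this linear transformation preserves the inner 
product of arbitrary pairs of states, so it is a unitary transformation $\mathbf{u}$. 

Since 

$$|\phi\rangle = \mathbf{u}|0\rangle, \quad |\psi\rangle = \mathbf{u}|1\rangle, \tag{C.1}$$

the operator $\mathbf{n}' = \mathbf{u}\mathbf{n}\mathbf{u}^\dagger$ acts as a Qbit number operator on $|\phi\rangle$ and $|\psi\rangle$: 

$$\mathbf{n}'|\phi\rangle = 0, \quad \mathbf{n}'|\psi\rangle = |\psi\rangle. \tag{C.2}$$

Since, as shown in Appendix B, any 1-Qbit unitary transformation $\mathbf{u}$ 
is associated with a rotation $\mathbf{R}(\vec{m},\theta)$, we have 

$$\mathbf{n}' = \mathbf{u}\mathbf{n}\mathbf{u}^\dagger = \tfrac12\big(\mathbf{1} - \mathbf{u}(\hat{z}\cdot\vec{\sigma})\mathbf{u}^\dagger\big) = \tfrac12\big(\mathbf{1} - \hat{z}'\cdot\vec{\sigma}\big), \tag{C.3}$$

where $\hat{z}' = \mathbf{R}(\vec{m},\theta)\hat{z}$. 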

Thus $\mathbf{n}'$, which functions as a number operator for the states 
$|\phi\rangle = \mathbf{u}(\vec{m},\theta)|0\rangle$ and $|\psi\rangle = \mathbf{u}(\vec{m},\theta)|1\rangle$, is constructed out of the component 
of the vector of operators $\vec{\sigma}$ along the direction $\hat{z}' = \mathbf{R}(\vec{m},\theta)\hat{z}$ 
in exactly the same way that $\mathbf{n}$, the number operator for the computational 
basis states $|0\rangle$ and $|1\rangle$, is constructed out of the component 
along $\hat{z}$. This suggests that there might be nothing special about the 
choice of $|0\rangle$ and $|1\rangle$ to form the computational-basis states for each 
Qbit - that any pair of orthogonal states, $|0'\rangle = \mathbf{u}|0\rangle$ and $|1'\rangle = \mathbf{u}|1\rangle$, 
could serve equally well. Furthermore, it is at least a consistent possibility 
that to make an apparatus to measure the Qbits in this new 
basis we need do nothing more than apply the rotation R associated 
with u to the apparatus that served to measure them in the original 
basis. 

This physical possibility is realized by some, but by no means all, of 
the physical systems that have been proposed as possible embodiments 
of Qbits. It is realized for certain atomic magnets - also called spins - 
which have the property that when the magnetization of such a spin is 
measured along any given direction, after the measurement the magnet 
is either maximally aligned along that direction or maximally aligned 
opposite to that direction. These two possible outcomes for a particular 
direction - conventionally taken to be $\hat{z}$ - are associated with the 
values 0 and 1 for the Qbit. After such a measurement the spin is left in 
the state $|0\rangle$ or $|1\rangle$. Any other state $|\phi\rangle$ and its orthogonal partner $|\psi\rangle$ 
specify an alternative direction, along which the magnetization might 
have been measured, associated with an alternative scheme for reading 
out values for the Qbits. 

For this example the continuum of possible states available to a Qbit, 
compared with the pair of states available to a Cbit, reflects the continuum 
of ways in which one can read a Qbit (measuring its magnetization 
along any direction) as opposed to the single option available for reading 
a Cbit (finding out what value it actually has). For Qbits that are not 
spins, the richness lies in the possibility of applying an arbitrary unitary 
transformation to each Qbit, before measuring it in the computational 
basis. What makes spins special is that applying the unitary transformation 
to the Qbits (which is not always that easy to arrange) can be 
replaced by straightforwardly applying the corresponding rotation to 
every 1-Qbit measurement gate. 



Appendix D 

Spooky action at a distance 


As a further exercise in applying the quantum-computational formalism 
to Qbits, and as a subject of interest in itself, though not directly 
related to quantum computation, I describe here a thought-provoking 
state of affairs illustrated with an example discovered by Lucien Hardy. 
(Similar thoughts are provoked by an example discovered by Daniel 
Greenberger, Michael Horne, and Anton Zeilinger, described in Section 
6.6.) 

Suppose that Alice and Bob each has one member of a pair of Qbits, 
which have been prepared in the 2-Qbit state 

$$|\Psi\rangle = \tfrac{1}{\sqrt{12}}\big(3|00\rangle + |01\rangle + |10\rangle - |11\rangle\big). \tag{D.1}$$

A specification of how to prepare two Qbits in such a Hardy state, 
somewhat more transparent than the general procedure described in 
Section 1.11, is given after the extraordinary properties of the Hardy 
state are described. One easily verifies that the state $|\Psi\rangle$ can also be 
written as 

$$|\Psi\rangle = \tfrac{1}{\sqrt{3}}\big(2|00\rangle - \mathbf{H}_a\mathbf{H}_b|11\rangle\big), \tag{D.2}$$

where we take $\mathbf{H}_a$ to act on the left (Alice’s) Qbit and $\mathbf{H}_b$ to act on the 
right (Bob’s) Qbit. Note the following four elementary properties of a 
pair of Qbits in the state $|\Psi\rangle$. 

(i) If Alice and Bob each measures their own Qbit, then (D.1) shows 
that there is a nonzero probability ($\tfrac{1}{12}$) that both get the result 1. 

(ii) If Alice and Bob each applies a Hadamard to their own Qbit then, 
since $\mathbf{H}^2 = \mathbf{1}$, the state (D.2) of the Qbits becomes 

$$\mathbf{H}_a\mathbf{H}_b|\Psi\rangle = \tfrac{1}{\sqrt{3}}\big(2\mathbf{H}_a\mathbf{H}_b|00\rangle - |11\rangle\big) = \tfrac{1}{\sqrt{3}}\big(|00\rangle + |01\rangle + |10\rangle\big), \tag{D.3}$$

so if they measure their Qbits after each has applied a Hadamard, then 
the probability that both get the value 1 is zero. 

(iii) If only Alice applies a Hadamard to her Qbit, then the state 
(D.2) of the two Qbits becomes 

$$\mathbf{H}_a|\Psi\rangle = \tfrac{1}{\sqrt{3}}\big(2\mathbf{H}_a|00\rangle - \mathbf{H}_b|11\rangle\big). \tag{D.4}$$

Since $\mathbf{H}_a|00\rangle$ is a linear combination of $|00\rangle$ and $|10\rangle$, and since $\mathbf{H}_b|11\rangle$ 
is a linear combination of $|10\rangle$ and $|11\rangle$, the state $|01\rangle$ does not appear 
in the expansion of $\mathbf{H}_a|\Psi\rangle$ in computational-basis states. So when the 
Qbits are subsequently measured the probability is zero that Alice will 
get the value 0 and Bob the value 1. 

(iv) If only Bob applies a Hadamard to his Qbit, then by the same 
reasoning (except for the interchange of Alice and Bob) when the Qbits 
are subsequently measured the probability is zero that Alice will get 
the value 1 and Bob the value 0. 

Table D.1. Four ways to measure two Qbits in the Hardy state (D.1) 

    Gates            Result           Possible? 
    Alice   Bob      Alice   Bob 
    1       1        1       1        Yes 
    1       H        1       0        No 
    H       1        0       1        No 
    H       H        1       1        No 

Taken together, these four cases seem to have some very strange 
implications. The cases are summarized in the four rows of Table D.1 
above. On the left is indicated whether (H) or not (1) Alice or Bob 
sends their Qbit through a Hadamard gate before sending it through a 
measurement gate. In the center is listed the measurement outcome of 
interest for each case. The column on the right specifies whether that 
outcome can or cannot occur for that particular case. 
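
The four entries in the last column of Table D.1 can be confirmed with a few lines of arithmetic. The Python sketch below stores the amplitudes of (D.1) in the order $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (Alice's Qbit on the left), applies Hadamards as each row prescribes, and prints the Born-rule probability of the outcome listed in that row. The function names are illustrative, not part of the text.

```python
import math

AMP = 1 / math.sqrt(12)
HARDY = [3 * AMP, AMP, AMP, -AMP]   # amplitudes of |00>, |01>, |10>, |11> in (D.1)

def hadamard(state, qubit):
    """Apply H to Alice's Qbit (qubit=0, left bit) or Bob's (qubit=1, right bit)."""
    r = 1 / math.sqrt(2)
    out = [0.0] * 4
    for index, amp in enumerate(state):
        bits = [index >> 1, index & 1]
        for new_bit in (0, 1):
            # H|b> = (|0> + (-1)^b |1>) / sqrt(2)
            sign = -1 if (bits[qubit] == 1 and new_bit == 1) else 1
            target = bits.copy()
            target[qubit] = new_bit
            out[2 * target[0] + target[1]] += sign * r * amp
    return out

def probability(state, alice, bob):
    """Probability that Alice's gate reads `alice` and Bob's reads `bob`."""
    return state[2 * alice + bob] ** 2

# (Alice's gate, Bob's gate, state before measurement, outcome of interest)
ROWS = [
    ("1", "1", HARDY, (1, 1)),
    ("1", "H", hadamard(HARDY, 1), (1, 0)),
    ("H", "1", hadamard(HARDY, 0), (0, 1)),
    ("H", "H", hadamard(hadamard(HARDY, 0), 1), (1, 1)),
]

for gate_a, gate_b, state, (x, y) in ROWS:
    print(gate_a, gate_b, x, y, round(probability(state, x, y), 12))
```

Only the first row should give a nonzero probability, namely $\tfrac{1}{12}$.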

To see what is strange, suppose that Alice and Bob each independently 
decides, by tossing coins, whether or not to apply a Hadamard 
to their Qbit before sending it through a measurement gate. There is 
a nonzero probability ($\tfrac14 \times \tfrac{1}{12} = \tfrac{1}{48}$) that neither applies a Hadamard 
and both measurement gates show 1 (see the first row of Table D.1). 
In the one time in 48 that this happens, it is tempting to conclude that 
each Qbit was, even before the coins were tossed, capable of producing 
a 1 when directly subjected to a measurement gate because, after all, 
each Qbit did produce a 1 when directly subjected to a measurement 
gate. 

But if Alice’s Qbit did indeed have such a capability, then, in the 
absence of spooky interactions between Bob’s Hadamard and Alice’s 
Qbit, her Qbit surely would have retained that capability, even if Bob’s 
coin had come up the other way and he had applied a Hadamard to 
his own Qbit before measuring it. But if Alice’s Qbit was indeed capable 
of registering a 1 when measured directly, then Bob’s Qbit must 
have been incapable of registering a 0 if measured after a Hadamard, 
since (see the second row of Table D.l) when Bob applies a Hadamard 
before his measurement and Alice does not, it is impossible for Bob’s 
measurement to give 0 while Alice’s gives 1. 

By the same reasoning (interchanging Alice and Bob and referring 
to the third row of Table D.l) we conclude that Alice’s Qbit must also 
have been incapable of registering a 0 when measured after a Hadamard. 

So in each of the slightly more than 2% of the cases in which neither 
Alice nor Bob applies Hadamards and both their measurement gates 
register 1, we conclude that if the tosses of both coins had come out 
the other way and both had applied Hadamards before measuring, then 
neither Qbit could have registered 0 when measured: both would have 
had to register 1. But according to the fourth row of Table D.l this can 
never happen. 

Although this particular argument was discovered by Lucien Hardy 
only in the early 1990s, similar situations (where the paradox is not 
as directly evident) have been known since a famous paper by John 
Bell appeared in 1964. Over the years passions have run high on the 
significance of this. Some claim that it shows that the value Alice or 
Bob finds upon measuring her or his Qbit does depend on whether or 
not the other, who, with his or her Qbit, could be far away, does or 
does not apply a Hadamard to his or her own Qbit before measuring it. 
They call this “quantum nonlocality” or “spooky action at a distance”- 
a translation of Einstein’s disparaging spukhafte Fernwirkungen. 

My own take on it is rather different. With any given pair of Qbits, 
Alice and Bob each either does or does not apply a Hadamard prior 
to their measurement. Only one of the four possible cases is actually 
realized. The other three cases do not happen. In a deterministic world 
it can make sense to talk about what would have happened if things 
had been other than the way they actually were, since the hypothetical 
situation can entail unique subsequent behavior. But in the intrinsically 
nondeterministic case of measuring Qbits, one cannot infer, from what 
Alice’s Qbit actually did, that it has a “capability” to do what it actually 
did, which it retains even in a hypothetical situation that did not, in 
fact, take place. To characterize the possible behavior of Alice’s Qbit 
in a fictitious world requires more than just the irrelevance of Bob’s 
decision whether or not to apply a Hadamard. It also requires that 
whatever it is that actually is relevant to Alice’s outcome remains the 
same in both worlds and plays the same role in bringing about that 
outcome. But the reading of a measurement gate has an irreducible 
randomness to it: nothing need play a role in bringing it about. 1 

The real lesson here is that if one has a single pair of Qbits and 
various choices of gates to apply to them before sending them through 
a measurement gate, then it makes no sense to infer, from the actual 


1 Conscience requires me to report here the existence of a small deviant 
subculture of physicists, known as Bohmians, who maintain that there is a 
deterministic substructure, unfortunately inaccessible to us, that underlies 
quantum phenomena. Needless to say, all Bohmians believe in real 
instantaneous action at a distance. 

outcome of the measurement for the actual choice of gates, additional 
constraints, going beyond those implied by the initial state of the Qbits, 
on the hypothetical outcomes of measurements in the fictional case in 
which one made a different choice of gates. It is nonsense to insist that 
Alice’s Qbit has to retain the “capability” to do what it actually did, if 
we imagine turning back the clock and doing it over again. Assigning 
a “capability” to Alice’s Qbit prior to the measurement is rather like 
assigning it a state. But the pre-measurement state (D.1) is an entangled 
state, so Alice’s Qbit has no state of its own. 

One can, however, let Alice and Bob repeatedly play this game with 
many different pairs of Qbits, always preparing the Qbits in the same 
initial 2-Qbit state (D.l). It is then entirely sensible to ask whether the 
statistics of the values Bob finds upon measuring his Qbit depend on 
whether Alice applied a Hadamard transform to her Qbit. For Alice and 
Bob can accumulate a mass of data, and directly compare the statistics 
Bob got when Alice applied the Hadamard with those he got when 
she did not. If Bob got a different statistical distribution of readings 
depending on whether Alice did or did not apply a Hadamard to her 
faraway Qbit before she measured it, this would permit nonspooky 
action at a distance which could actually be used to send messages. So 
it is important to note that Bob’s statistics do not, in fact, depend on 
whether or not Alice applies a Hadamard. 

We can show this under quite general conditions. Suppose that n 
Qbits are divided into two subsets, each of which may be independently 
manipulated (i.e. subjected to unitary transformations) prior to 
a measurement. Let the $n_a$ Qbits on the left constitute one such group 
and the $n_b = n - n_a$ on the right, the other. Think of the first group 
as under the control of Alice and the second as belonging to Bob. If 
the n Qbits are always prepared in the state $|\Psi\rangle$, then if Alice and Bob 
separately measure their Qbits, the Born rule tells us that the joint 
probability $p(x, y)$ of Alice getting x and Bob y is 

$$p(x,y) = \langle\Psi|\mathbf{P}^a_x\mathbf{P}^b_y|\Psi\rangle, \tag{D.5}$$

where the projection operator $\mathbf{P}^a_x$ acts only on Alice’s Qbits (i.e. it acts 
as the identity on Bob’s) and $\mathbf{P}^b_y$ only on Bob’s. 

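Suppose, now, that Alice acts on her Qbits with the unitary transformation 
$\mathbf{U}_a$ before making her measurement and Bob acts on his with 
$\mathbf{U}_b$. Then the state $|\Psi\rangle$ is changed into 

$$|\Phi\rangle = \mathbf{U}_a\mathbf{U}_b|\Psi\rangle. \tag{D.6}$$

Now the probability of their measurements giving x and y, conditioned 
on their choices of unitary transformation, is 

$$\begin{aligned}
p(x,y|\mathbf{U}_a,\mathbf{U}_b) &= \langle\Phi|\mathbf{P}^a_x\mathbf{P}^b_y|\Phi\rangle = \langle\Psi|\mathbf{U}_a^\dagger\mathbf{U}_b^\dagger\,\mathbf{P}^a_x\mathbf{P}^b_y\,\mathbf{U}_a\mathbf{U}_b|\Psi\rangle \\
&= \langle\Psi|(\mathbf{U}_a^\dagger\mathbf{P}^a_x\mathbf{U}_a)(\mathbf{U}_b^\dagger\mathbf{P}^b_y\mathbf{U}_b)|\Psi\rangle
\end{aligned} \tag{D.7}$$

(where we have used the fact that all operators that act only on Alice’s 
Qbits commute with all operators that act only on Bob’s). 

It follows from the fact that 

$$\sum_x \mathbf{U}_a^\dagger\mathbf{P}^a_x\mathbf{U}_a = \mathbf{U}_a^\dagger\Big(\sum_x\mathbf{P}^a_x\Big)\mathbf{U}_a = \mathbf{U}_a^\dagger\mathbf{U}_a = \mathbf{1} \tag{D.8}$$

that Bob’s marginal statistics do not depend on what Alice chose to do 
to her own Qbits: 

$$p(y|\mathbf{U}_a,\mathbf{U}_b) = \sum_x p(x,y|\mathbf{U}_a,\mathbf{U}_b) = \langle\Psi|(\mathbf{U}_b^\dagger\mathbf{P}^b_y\mathbf{U}_b)|\Psi\rangle = p(y|\mathbf{U}_b), \tag{D.9}$$

which does not depend on the particular unitary transformation $\mathbf{U}_a$ 
chosen by Alice. Therefore the statistics of the measurement outcomes 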
for any group of Qbits are not altered by anything done to other Qbits 
(provided, of course, that the other Qbits do not subsequently interact 
with those in the original group, for example by the application of 
appropriate 2-Qbit gates). 
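
For the Hardy state itself the content of (D.9) is easy to exhibit. The Python sketch below applies a 1-Qbit unitary to Alice's Qbit only and computes Bob's marginal distribution by summing the joint probabilities over Alice's outcome. The second unitary `V` is an arbitrary choice made for this illustration, as are the function names.

```python
import cmath
import math

AMP = 1 / math.sqrt(12)
HARDY = [3 * AMP, AMP, AMP, -AMP]   # amplitudes of |00>, |01>, |10>, |11> in (D.1)

def apply_alice(u, state):
    """Apply the 2x2 unitary u to Alice's (left) Qbit, leaving Bob's alone."""
    out = [0j] * 4
    for x_new in (0, 1):
        for y in (0, 1):
            out[2 * x_new + y] = sum(u[x_new][x] * state[2 * x + y] for x in (0, 1))
    return out

def bob_marginal(state):
    """p(y) for Bob, obtained by summing |amplitude|^2 over Alice's result x."""
    return [sum(abs(state[2 * x + y]) ** 2 for x in (0, 1)) for y in (0, 1)]

R = 1 / math.sqrt(2)
H = [[R, R], [R, -R]]

# An arbitrary second unitary: a real rotation with a phase on the second row.
T = 0.37
PHASE = cmath.exp(0.9j)
V = [[math.cos(T), -math.sin(T)],
     [PHASE * math.sin(T), PHASE * math.cos(T)]]

print(bob_marginal(HARDY))
print(bob_marginal(apply_alice(H, HARDY)))
print(bob_marginal(apply_alice(V, HARDY)))
```

All three lines should show the same pair of probabilities, $\tfrac{10}{12}$ and $\tfrac{2}{12}$, up to rounding, even though the joint distribution of both readings does change.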

Like any 2-Qbit state, the state (D.l) leading to this remarkable set 
of data can be constructed with a single cNOT gate and three 1-Qbit 
unitary gates. Here is a construction that is somewhat more direct than 
the general construction given in Section 1.11. It exploits the connection 
between 1-Qbit unitary transformations and three-dimensional 
rotations developed in Appendix B. 

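It follows from (D.3) that 

$$\begin{aligned}
|\Psi\rangle &= \mathbf{H}_a\mathbf{H}_b\tfrac{1}{\sqrt{3}}\big(|00\rangle + |01\rangle + |10\rangle\big) \\
&= \mathbf{H}_a\mathbf{H}_b\Big(\sqrt{\tfrac23}\,\mathbf{H}_b|00\rangle + \sqrt{\tfrac13}\,|10\rangle\Big) \\
&= \mathbf{H}_a\Big(\sqrt{\tfrac23}\,|00\rangle + \sqrt{\tfrac13}\,\mathbf{H}_b|10\rangle\Big) \\
&= \mathbf{H}_a\mathbf{C}^H_{ab}\Big(\sqrt{\tfrac23}\,|0\rangle + \sqrt{\tfrac13}\,|1\rangle\Big)|0\rangle \\
&= \mathbf{H}_a\mathbf{C}^H_{ab}\mathbf{w}_a|00\rangle,
\end{aligned} \tag{D.10}$$

where $\mathbf{w}$ is any 1-Qbit unitary transformation that takes $|0\rangle$ into 
$\sqrt{\tfrac23}\,|0\rangle + \sqrt{\tfrac13}\,|1\rangle$, and $\mathbf{C}^H$ is a 2-Qbit controlled-Hadamard gate: 

$$\mathbf{C}^H_{ab}|xy\rangle = \mathbf{H}_b^x|xy\rangle. \tag{D.11}$$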


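To construct a controlled-Hadamard $\mathbf{C}^H$ from a controlled-NOT $\mathbf{C}$, 
note that the NOT operation $\mathbf{X}$ is $\hat{x}\cdot\vec{\sigma}$ while the Hadamard transformation 
is $\mathbf{H} = (1/\sqrt{2})(\mathbf{X} + \mathbf{Z}) = (1/\sqrt{2})(\hat{x} + \hat{z})\cdot\vec{\sigma}$. It follows from 
the discussion of 1-Qbit unitaries in Appendix B that 

$$\mathbf{H} = \mathbf{u}\mathbf{X}\mathbf{u}^\dagger, \tag{D.12}$$

where $\mathbf{u}$ is the 1-Qbit unitary associated with any rotation that takes $\hat{x}$ 
into $(1/\sqrt{2})(\hat{x} + \hat{z})$. Since we also have $\mathbf{1} = \mathbf{u}\mathbf{u}^\dagger$, it follows that 

$$\mathbf{C}^H_{ab} = \mathbf{u}_b\mathbf{C}_{ab}\mathbf{u}_b^\dagger. \tag{D.13}$$

So (D.10) reduces to the compact form 

$$|\Psi\rangle = \mathbf{H}_a\mathbf{u}_b\mathbf{C}_{ab}\mathbf{w}_a\mathbf{u}_b^\dagger|00\rangle.$$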

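If you want an explicit form for $\mathbf{w}$, its matrix in the computational 
basis could be 

$$\mathbf{w} = \frac{1}{\sqrt{3}}\begin{pmatrix}\sqrt{2} & -1\\ 1 & \sqrt{2}\end{pmatrix}. \tag{D.14}$$

To get an explicit form for $\mathbf{u}$, note that a rotation through $\pi/4$ about 
the y-axis takes $\hat{x}$ into $(1/\sqrt{2})(\hat{x} + \hat{z})$. The associated unitary transformation 
is 

$$\mathbf{u} = \exp\big(i(\pi/8)\sigma_y\big) = \cos(\pi/8)\,\mathbf{1} + i\sin(\pi/8)\,\sigma_y. \tag{D.15}$$

Since the matrix for $\sigma_y$ in the computational basis is 

$$\begin{pmatrix}0 & -i\\ i & 0\end{pmatrix},$$

the matrix for $\mathbf{u}$ is 

$$\begin{pmatrix}\cos(\pi/8) & \sin(\pi/8)\\ -\sin(\pi/8) & \cos(\pi/8)\end{pmatrix}.$$

Since the matrices for $\mathbf{X}$ and $\mathbf{H}$ are 

$$\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix} \quad\text{and}\quad \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}, \tag{D.16}$$

you can easily confirm that these three matrices do indeed satisfy (D.12). 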
Verifying this should give you an appreciation for the power of the 
method described in Appendix B. 
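
The confirmation can also be left to a machine. The Python sketch below multiplies out $\mathbf{u}\mathbf{X}\mathbf{u}^\dagger$ with the matrix of $\mathbf{u}$ given above, and then runs the compact circuit $\mathbf{H}_a\mathbf{u}_b\mathbf{C}_{ab}\mathbf{w}_a\mathbf{u}_b^\dagger$ on $|00\rangle$, applying the rightmost gate first. The matrix `W` is one assumed choice of $\mathbf{w}$: all the construction requires is that it be unitary with first column $(\sqrt{2/3}, \sqrt{1/3})$. Amplitudes are stored in the order $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ with Alice's Qbit on the left; the helper names are illustrative.

```python
import math

def apply_1qbit(m, state, qubit):
    """Apply the 2x2 matrix m to Alice's Qbit (qubit=0) or Bob's (qubit=1)."""
    out = [0.0] * 4
    for index, amp in enumerate(state):
        bits = [index >> 1, index & 1]
        for new_bit in (0, 1):
            target = bits.copy()
            target[qubit] = new_bit
            out[2 * target[0] + target[1]] += m[new_bit][bits[qubit]] * amp
    return out

def cnot(state):
    """Controlled-NOT with Alice's Qbit as control and Bob's as target."""
    return [state[0], state[1], state[3], state[2]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

C8, S8 = math.cos(math.pi / 8), math.sin(math.pi / 8)
U = [[C8, S8], [-S8, C8]]        # matrix of u from (D.15)
U_DAG = [[C8, -S8], [S8, C8]]    # adjoint of u (u is real, so its transpose)
X = [[0, 1], [1, 0]]
R2 = 1 / math.sqrt(2)
H = [[R2, R2], [R2, -R2]]
R3 = 1 / math.sqrt(3)
W = [[math.sqrt(2) * R3, -R3], [R3, math.sqrt(2) * R3]]   # assumed choice of w

H_FROM_X = matmul(matmul(U, X), U_DAG)    # should reproduce H, as in (D.12)

state = [1.0, 0.0, 0.0, 0.0]              # |00>
state = apply_1qbit(U_DAG, state, 1)      # u_b^dagger
state = apply_1qbit(W, state, 0)          # w_a
state = cnot(state)                       # C_ab
state = apply_1qbit(U, state, 1)          # u_b
state = apply_1qbit(H, state, 0)          # H_a

AMP = 1 / math.sqrt(12)
HARDY = [3 * AMP, AMP, AMP, -AMP]         # the target state (D.1)

print(H_FROM_X)
print(state)
print(HARDY)
```

The last two printed lists should agree to rounding error.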




Appendix E 

Consistency of the generalized 
Born rule 


A general state of m + n Qbits can be written as 

$$|\Psi\rangle_{m+n} = \sum_{x,y}\alpha_{xy}|x\rangle_m|y\rangle_n. \tag{E.1}$$

The most general form of the Born rule asserts that if just the m 
Qbits associated with the states $|x\rangle_m$ in (E.1) are measured, then with 
probability 

$$p(x) = \sum_y|\alpha_{xy}|^2 \tag{E.2}$$

the result will be x, and after the measurement the state of all m + n 
Qbits will be the product state 

$$|x\rangle_m|\Phi_x\rangle_n, \tag{E.3}$$

where the (correctly normalized) state of the n unmeasured Qbits is 
given by 

$$|\Phi_x\rangle_n = \frac{1}{\sqrt{p(x)}}\sum_y\alpha_{xy}|y\rangle_n. \tag{E.4}$$


This strongest form of the Born rule satisfies the reasonable consistency 
requirement that measuring r Qbits and then immediately 
measuring s more, before any other gates have had a chance to act, 
is equivalent to measuring all the r + s Qbits together. An important 
consequence is that an n-Qbit measurement gate can be constructed 
by applying n 1-Qbit measurement gates to the n individual Qbits, as 
illustrated in Figure 1.8. 

To establish this consistency condition, write the state of r + s + u 
Qbits as 

$$|\Psi\rangle_{r+s+u} = \sum_{x,y,z}\alpha_{xyz}|x\rangle_r|y\rangle_s|z\rangle_u. \tag{E.5}$$

If the r + s Qbits are all measured together then a direct application 
of the rule tells us that the result will be xy with probability 

$$p(xy) = \sum_z|\alpha_{xyz}|^2, \tag{E.6}$$

and that the post-measurement state of the Qbits will be 

$$|x\rangle_r|y\rangle_s|\Psi_{xy}\rangle_u, \qquad |\Psi_{xy}\rangle_u = \frac{1}{\sqrt{p(xy)}}\sum_z\alpha_{xyz}|z\rangle_u. \tag{E.7}$$

On the other hand if just the first r Qbits are measured then the rule 
tells us that the result will be x with probability 

$$p(x) = \sum_{y,z}|\alpha_{xyz}|^2, \tag{E.8}$$

and that the post-measurement state will be 

$$|x\rangle_r\,\frac{1}{\sqrt{p(x)}}\sum_{y,z}\alpha_{xyz}|y\rangle_s|z\rangle_u. \tag{E.9}$$

Given that the result of the first measurement is x, so that the post-measurement 
state is (E.9), a further application of the rule tells us that 
if the next s Qbits are measured, the result will be y with probability 

$$p(y|x) = \sum_z\left|\frac{\alpha_{xyz}}{\sqrt{p(x)}}\right|^2, \tag{E.10}$$

and that the post-measurement state after the second measurement 
will be 

$$|x\rangle_r|y\rangle_s|\Phi_{xy}\rangle_u, \tag{E.11}$$

where 

$$|\Phi_{xy}\rangle_u = \frac{1}{\sqrt{p(y|x)}}\,\frac{1}{\sqrt{p(x)}}\sum_z\alpha_{xyz}|z\rangle_u. \tag{E.12}$$

Since the joint probability of getting x and then getting y is related 
to the conditional probability p(y|x) by 

$$p(xy) = p(x)\,p(y|x), \tag{E.13}$$

this final state and probability are exactly the same as the probability 
(E.6) and final state (E.7) associated with a direct measurement of all 
r + s Qbits. 
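
The agreement can be seen concretely in the smallest case, r = s = u = 1. The Python sketch below picks an arbitrary set of eight amplitudes (the particular numbers carry no significance), normalizes them, and evaluates the probabilities and final states both ways. The function names are invented for this illustration.

```python
import math

BITS = (0, 1)

# Arbitrary sample amplitudes raw[x][y][z]; any nonzero values would do.
RAW = [[[1 + 2j, 0.5], [-1j, 2]],
       [[0.3, -1.5], [1 - 1j, 0.7j]]]
NORM = math.sqrt(sum(abs(RAW[x][y][z]) ** 2 for x in BITS for y in BITS for z in BITS))
ALPHA = [[[RAW[x][y][z] / NORM for z in BITS] for y in BITS] for x in BITS]

def p_joint(x, y):
    """Probability of xy when r + s Qbits are measured together, as in (E.6)."""
    return sum(abs(ALPHA[x][y][z]) ** 2 for z in BITS)

def p_first(x):
    """Probability of x when only the first r Qbits are measured, as in (E.8)."""
    return sum(abs(ALPHA[x][y][z]) ** 2 for y in BITS for z in BITS)

def p_second_given_first(y, x):
    """Probability of y in the second measurement, given x, as in (E.10)."""
    return sum(abs(ALPHA[x][y][z] / math.sqrt(p_first(x))) ** 2 for z in BITS)

def state_direct(x, y):
    """State of the unmeasured Qbit after the joint measurement, as in (E.7)."""
    return [ALPHA[x][y][z] / math.sqrt(p_joint(x, y)) for z in BITS]

def state_two_step(x, y):
    """State of the unmeasured Qbit after the two-step measurement, as in (E.12)."""
    scale = math.sqrt(p_second_given_first(y, x)) * math.sqrt(p_first(x))
    return [ALPHA[x][y][z] / scale for z in BITS]

for x in BITS:
    for y in BITS:
        print(x, y, p_joint(x, y), p_first(x) * p_second_given_first(y, x))
```

Each printed line should show the same probability twice, which is the content of (E.13).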














Appendix F 

Other aspects of Deutsch's problem 


Suppose that one attempted to solve Deutsch’s problem, not by the 
trick that does the job in Chapter 2, but by doing the standard thing: 
starting with input and output registers in the state |0) |0), applying a 
Hadamard to the input register, and then using the one application of 
$\mathbf{U}_f$ to associate with the two Qbits the state 

$$|\psi\rangle = \tfrac{1}{\sqrt{2}}|0\rangle|f(0)\rangle + \tfrac{1}{\sqrt{2}}|1\rangle|f(1)\rangle. \tag{F.1}$$

A direct measurement of both Qbits reveals the value of f at either 0 
or 1 (randomly), but gives no information whatever about the question 
under investigation, whether or not f(0) = f(1). 

Is there anything further one can do to two Qbits in the state (F.1) to 
learn whether or not f(0) = f(1) (without any further application of 
$\mathbf{U}_f$)? The answer is yes, there is. But it works only half the time. Here 
is one such procedure. 

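For each of the four possibilities for the unknown function f, the 
corresponding forms for the state (F.1) are 

$$f(0) = 0,\ f(1) = 0:\quad |\psi\rangle_{00} = \tfrac{1}{\sqrt{2}}\big(|0\rangle + |1\rangle\big)|0\rangle, \tag{F.2}$$

$$f(0) = 1,\ f(1) = 1:\quad |\psi\rangle_{11} = \tfrac{1}{\sqrt{2}}\big(|0\rangle + |1\rangle\big)|1\rangle, \tag{F.3}$$

$$f(0) = 0,\ f(1) = 1:\quad |\psi\rangle_{01} = \tfrac{1}{\sqrt{2}}\big(|0\rangle|0\rangle + |1\rangle|1\rangle\big), \tag{F.4}$$

$$f(0) = 1,\ f(1) = 0:\quad |\psi\rangle_{10} = \tfrac{1}{\sqrt{2}}\big(|0\rangle|1\rangle + |1\rangle|0\rangle\big). \tag{F.5}$$

We know that $|\psi\rangle$ has one of these four forms, and wish to distinguish 
between two cases: 

Case 1: $|\psi\rangle = |\psi\rangle_{00}$ or $|\psi\rangle_{11}$; Case 2: $|\psi\rangle = |\psi\rangle_{01}$ or $|\psi\rangle_{10}$. 

By applying Hadamards to both Qbits we change the four possible 
states to 

$$(\mathbf{H}\otimes\mathbf{H})|\psi\rangle_{00} = \tfrac{1}{\sqrt{2}}\big(|0\rangle|0\rangle + |0\rangle|1\rangle\big), \tag{F.6}$$

$$(\mathbf{H}\otimes\mathbf{H})|\psi\rangle_{11} = \tfrac{1}{\sqrt{2}}\big(|0\rangle|0\rangle - |0\rangle|1\rangle\big), \tag{F.7}$$

$$(\mathbf{H}\otimes\mathbf{H})|\psi\rangle_{01} = \tfrac{1}{\sqrt{2}}\big(|0\rangle|0\rangle + |1\rangle|1\rangle\big), \tag{F.8}$$

$$(\mathbf{H}\otimes\mathbf{H})|\psi\rangle_{10} = \tfrac{1}{\sqrt{2}}\big(|0\rangle|0\rangle - |1\rangle|1\rangle\big). \tag{F.9}$$

Now measure both Qbits. If we have one of the Case-1 states, (F.6) 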
or (F.7), we get 00 half the time and 01 half the time; and if we have 
one of the Case-2 states, (F.8) or (F.9), we get 00 half the time and 11 
half the time. So regardless of what the state is, half the time we get 00 
and learn nothing whatever, and half the time we get 01 or 11 and learn 
which case we are dealing with. 
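
The procedure is short enough to simulate directly. The Python sketch below constructs the state (F.1) for each of the four functions, applies Hadamards to both Qbits, and lists the probabilities of the four measurement outcomes in the order 00, 01, 10, 11 (input register on the left). The function names are chosen for this example.

```python
import math

def state_for(f0, f1):
    """Amplitudes of |00>, |01>, |10>, |11> for the state (F.1); index = 2*input + output."""
    r = 1 / math.sqrt(2)
    state = [0.0] * 4
    state[2 * 0 + f0] += r   # the term |0>|f(0)>
    state[2 * 1 + f1] += r   # the term |1>|f(1)>
    return state

def hadamard_both(state):
    """Apply H to each Qbit: <ab|(H x H)|cd> = (1/2)(-1)^(a*c + b*d)."""
    out = []
    for a in (0, 1):
        for b in (0, 1):
            total = 0.0
            for c in (0, 1):
                for d in (0, 1):
                    total += (-1) ** (a * c + b * d) * state[2 * c + d]
            out.append(total / 2)
    return out

def outcome_probabilities(f0, f1):
    """Probabilities of reading 00, 01, 10, 11 after the two Hadamards."""
    return [amp ** 2 for amp in hadamard_both(state_for(f0, f1))]

for f0, f1 in [(0, 0), (1, 1), (0, 1), (1, 0)]:
    probs = [round(p, 12) for p in outcome_probabilities(f0, f1)]
    print("f(0) =", f0, " f(1) =", f1, " probabilities:", probs)
```

In every case the outcome 00 has probability one half; the remaining half falls on 01 when f(0) = f(1) and on 11 when f(0) ≠ f(1).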

This way of dealing with Deutsch’s problem - with a 50% chance 
of success - was noticed before the discovery of the 100%-effective 
method described in Chapter 2. One might wonder whether some more 
clever choice of operations on the state (F.l) could enable one always 
to make the discrimination. It is easy to show that this is impossible. 

We wish to apply some general 2-Qbit unitary transformation $\mathbf{U}$ to 
$|\psi\rangle$ with the result that every possible outcome of a subsequent measurement 
must rule out one or the other of the two cases. For this to 
be so it must be that those computational-basis states that appear in 
the expansions of the states $\mathbf{U}|\psi\rangle_{00}$ and $\mathbf{U}|\psi\rangle_{11}$ cannot appear in the 
computational-basis expansions of the states $\mathbf{U}|\psi\rangle_{01}$ and $\mathbf{U}|\psi\rangle_{10}$, and 
vice versa, for otherwise there would be a nonzero probability of a measurement 
outcome that did not enable us to discriminate between the 
two cases. Consequently $\mathbf{U}|\psi\rangle_{00}$ and $\mathbf{U}|\psi\rangle_{11}$ must each be orthogonal 
to each of $\mathbf{U}|\psi\rangle_{01}$ and $\mathbf{U}|\psi\rangle_{10}$. But this is impossible, because unitary 
transformations preserve inner products, while (F.2)-(F.5) show 
that the inner product of any Case-1 state with any Case-2 
state is $\tfrac12$. 

One can, in fact, show under very general circumstances that, starting 
with two Qbits in the state (F.1), one cannot do better than applying 
Hadamards to both before measuring: there must be at least a 
50% chance that the measurement outcomes will not enable one to 
discriminate between Case 1 and Case 2. The proof that 50% is the 
best one can do provides an instructive illustration of many features of 
the quantum-mechanical formalism. 

Suppose that we bring in n additional (ancillary) Qbits to help us out. 
These might be used to process the input and output registers further 
through some elaborate quantum subroutine, producing an arbitrary 
unitary transformation W that acts on all n + 2 Qbits before a final 
measurement of the n + 2 Qbits is made. (This, of course, reduces to 
the simpler case of no ancillary Qbits, if W acts as the identity except 
on the original two Qbits, hereafter called the pair.) 

Let the ancillary Qbits start off in some state $|\chi\rangle_n$, which we can 
take to be $|0\rangle_n$. (Any other n-Qbit state is related to $|0\rangle_n$ by a unitary 
transformation in the ancillary subspace, which can be absorbed into 
$\mathbf{W}$.) Let the pair be in one of the four states $|\psi\rangle$ given in (F.2)-(F.5). 
After $\mathbf{W}$ acts the probability of a measurement giving x ($0 \le x \le 3$) 
for the pair and y ($0 \le y < 2^n$) for the ancillary Qbits is 

$$p_\psi(x,y) = |\langle x,y|\mathbf{W}|\psi,0\rangle|^2, \tag{F.10}$$

where it is convenient to write a (2 + n)-Qbit state of the form 
$|\psi\rangle|\chi\rangle_n$ as $|\psi,\chi\rangle$. 

Note next that for arbitrary pair states $|\phi\rangle$ 

$$p_\phi(x,y) = 0 \ \text{ if and only if } \ \langle x,y|\mathbf{W}|\phi,0\rangle = 0, \tag{F.11}$$

so if $p_\phi(x,y)$ vanishes for several different states $|\phi\rangle$, linearity requires 
it also to vanish for any state in the subspace they span. Therefore 
any measurement outcome that enables us to discriminate between 
Case 1 and Case 2 must have zero probability either for both of the states 
(F.2) and (F.3) and therefore for any state in the subspace they span, or 
for any state in the subspace spanned by the states (F.4) and (F.5). Now 
(F.2)-(F.5) reveal that the state 

$$|\alpha\rangle = \tfrac12\big(|00\rangle + |01\rangle + |10\rangle + |11\rangle\big) \tag{F.12}$$

belongs to both of these subspaces. So if there are any measurement 
outcomes x, y with 

$$p_\alpha(x,y) \neq 0, \tag{F.13}$$

then such outcomes are uninformative. Therefore the probability of a 
measurement outcome that fails to discriminate between Case 1 and 
Case 2 is at least 

$$p_{\min} = {\sum_{x,y}}'\,p_\psi(x,y), \tag{F.14}$$

where the prime indicates that the sum is restricted to those measurement 
outcomes x, y that satisfy (F.13). 

Now it is easy to verify that every one of the four possible forms 
(F.2)-(F.5) for $|\psi\rangle$ is of the form 

$$|\psi\rangle = \tfrac{1}{\sqrt{2}}\big(|\alpha\rangle + |\beta\rangle\big), \tag{F.15}$$

where $|\alpha\rangle$ is given in (F.12) and $|\beta\rangle$ is orthogonal to $|\alpha\rangle$. Since $|\psi\rangle$ has 
the form (F.15), we have from (F.14) and (F.10) that 

$$p_{\min} = \tfrac12{\sum_{x,y}}'\Big(p_\alpha(x,y) + 2\,\mathrm{Re}\big[\langle\beta,0|\mathbf{W}^\dagger|x,y\rangle\langle x,y|\mathbf{W}|\alpha,0\rangle\big] + p_\beta(x,y)\Big). \tag{F.16}$$

Although the sum in (F.16) is restricted to those x, y satisfying (F.13), 
we can extend it in each of the first two terms to all x, y since this 
adds either zero probabilities (first term) or (because of (F.11)) zero 
amplitudes (second term). The first term then gives 

$$\sum_{\text{all } x,y}p_\alpha(x,y) = 1, \tag{F.17}$$

while the second gives 

$$2\,\mathrm{Re}\sum_{\text{all } x,y}\langle\beta,0|\mathbf{W}^\dagger|x,y\rangle\langle x,y|\mathbf{W}|\alpha,0\rangle = 2\,\mathrm{Re}\,\langle\beta,0|\mathbf{W}^\dagger\mathbf{W}|\alpha,0\rangle = 2\,\mathrm{Re}\,\langle\beta,0|\mathbf{1}|\alpha,0\rangle = 0, \tag{F.18}$$

since $|\alpha\rangle$ and $|\beta\rangle$ are orthogonal. Hence 

$$p_{\min} = \tfrac12\Big(1 + {\sum_{x,y}}'\,p_\beta(x,y)\Big) \ge \tfrac12. \tag{F.19}$$

One must fail at least half the time. 



Appendix G 

The probability of success in 
Simon's problem 


Section 2.5 gives a rough argument that the number of runs necessary 
to determine the n -bit number a in Simon’s problem is of order n. 
Further analysis is needed to get a more accurate estimate of how many 
runs give a high probability of learning a . 

If we invoke $\mathbf{U}_f$ m times, we learn m independently selected random 
numbers y, whose bits $y_i$ satisfy 

$$a\cdot y = \sum_{i=0}^{n-1}y_i a_i = 0 \pmod 2. \tag{G.1}$$

If we have n − 1 relations (G.1) for n − 1 linearly independent sets of 
$y_i$, then this gives us enough equations to determine a unique nonzero 
a. “Linearly independent” in this context means linear independence 
over the integers modulo 2; i.e. no subsets of the y’s should satisfy 
$y \oplus y' \oplus y'' \oplus \cdots = 0$ (mod 2). We have to invoke the subroutine enough 
times to give us a high probability of coming up with n — 1 linearly 
independent values of y. 

Regardless of the size of n, for not terribly large x the probability 
becomes extremely close to 1 that a set of n + x random vectors from an 
(n − 1)-dimensional subspace of the space of n-dimensional vectors, 
with components restricted to the modulo 2 integers 0 and 1, contains 
n − 1 linearly independent vectors. This is obvious for ordinary vectors with 
continuous components, since the probability that a randomly selected 
vector in an (n − 1)-dimensional space lies in a specified subspace of 
lower dimensionality is zero - it is certain to have a nonzero component 
outside of the lower-dimensional subspace. The argument is trickier 
here because components are restricted to only two values: 1 or 0. 

Introduce a basis in the full (n − 1)-dimensional subspace of all vectors 
y with a · y = 0, so that a random vector in the subspace can be 
expressed as a linear combination of the basis vectors with coefficients 
that are randomly and independently 1 or 0. Arrange the resulting 
(n + x) random vectors of ones and zeros into a matrix of n + x rows 
and n − 1 columns. Since the row rank (the number of linearly independent 
rows) of a matrix is the same as the column rank, even when 
arithmetic is confined to the integers modulo 2, the probability that 
some subset of n − 1 of the n + x (n − 1)-dimensional row vectors is 
linearly independent is the same as the probability that all n − 1 of the 
(n + x)-dimensional column vectors are linearly independent. But it 
is easy to find a lower bound for this last probability. 

Pick a column vector at random. The probability that it is nonzero
is $1 - 1/2^{n+x}$. If so, take it as the first member of a basis in
which we expand the remaining column vectors. The probability that a
second, randomly selected column vector is independent of the first is
$1 - 1/2^{n+x-1}$, since it will be independent unless every one of its
(random) components along the remaining $n + x - 1$ vectors is zero.
Continuing in this way, we conclude that the probability $q$ of all
$n-1$ column vectors being linearly independent is

$$q = \left(1 - \frac{1}{2^{n+x}}\right)\left(1 - \frac{1}{2^{n+x-1}}\right)\cdots\left(1 - \frac{1}{2^{x+2}}\right). \tag{G.2}$$


(If you're suspicious of this argument, reassure yourself by checking
that it gives the right $q$ when $n = 3$, $a = 111$, and $x = 0$, by
explicitly enumerating which of the 64 different sets of three $y$'s,
all satisfying $a \cdot y = 0$, contain two linearly independent vectors.)
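The suggested check can be carried out mechanically. The following sketch, which is an illustration rather than part of the original argument, compares the product formula (G.2) with a brute-force count; the names `rank_mod2`, `q_formula`, and `q_enumerated` are hypothetical, and vectors are again assumed to be stored as integer bit masks.

```python
from fractions import Fraction
from itertools import product


def rank_mod2(vectors):
    """Rank over the integers modulo 2 of bit-vectors stored as ints."""
    pivots = {}  # leading bit -> basis vector with that leading bit
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]  # strictly lowers the leading bit
    return len(pivots)


def q_formula(n, x):
    """The product (G.2): factors 1 - 1/2^k for k = x+2, ..., n+x."""
    q = Fraction(1)
    for k in range(x + 2, n + x + 1):
        q *= 1 - Fraction(1, 2 ** k)
    return q


def q_enumerated(n, a, x):
    """Fraction of all (n+x)-tuples of y's with a.y = 0 that have rank n-1."""
    subspace = [y for y in range(2 ** n) if bin(y & a).count("1") % 2 == 0]
    m = n + x
    good = sum(1 for ys in product(subspace, repeat=m)
               if rank_mod2(ys) == n - 1)
    return Fraction(good, len(subspace) ** m)


print(q_enumerated(3, 0b111, 0), q_formula(3, 0))
```

For $n = 3$, $a = 111$, $x = 0$, both routes give 42 of the 64 triples, i.e. $q = 21/32$.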

Finally, to get a convenient lower bound on the size of $q$, note that
if we have a set of non-negative numbers $a, b, c, \ldots$ whose sum is
less than 1, then the product $(1-a)(1-b)(1-c)\cdots$ exceeds
$1 - (a + b + c + \cdots)$. (This is easily proved by induction on the
number of numbers in the set.) The probability $q$ is therefore greater
than

$$1 - \frac{1}{2^{x+2}} - \frac{1}{2^{x+3}} - \cdots - \frac{1}{2^{n+x}}, \tag{G.3}$$

and this, in turn, is greater than

$$1 - \frac{1}{2^{x+1}}. \tag{G.4}$$

So if we want to determine $a$ with less than one chance in a million of
failure, it is enough to run the subroutine $n + 20$ times.










Appendix H 

One way to make a cNOT gate 


This more technical appendix is addressed to physicists curious about
how one might, at least in principle, construct a cNOT gate, exploiting
physically plausible interactions between two Qbits. Readers with
no background in quantum physics will find some parts rather obscure.
It is relevant only to readers curious about the possibilities for
quantum-computational hardware, and plays no role in subsequent
developments.

The controlled-NOT gate $C_{10}$ with control Qbit 1 and target Qbit
0 can be written as

$$C_{10} = H_0 C^Z H_0, \tag{H.1}$$

where the controlled-Z operation is given by

$$C^Z = \tfrac{1}{2}(1 + Z_1 + Z_0 - Z_1 Z_0). \tag{H.2}$$

Because of its symmetry under interchange of the two Qbits, we may
write $C^Z$ without the subscripts distinguishing control and target. To
within 1-Qbit Hadamard transformations, the problem of constructing
a controlled-NOT gate is the same as that of constructing a
controlled-Z gate.

Since $(C^Z)^2 = 1$, $C^Z$ satisfies the identity

$$\exp(i C^Z \theta) = \cos\theta + i C^Z \sin\theta. \tag{H.3}$$

We can therefore rewrite (H.2) as

$$C^Z = -i \exp\left(i\tfrac{\pi}{2} C^Z\right) = -i \exp\left[i\tfrac{\pi}{4}(1 + Z_1 + Z_0 - Z_1 Z_0)\right] = e^{-i\pi/4} \exp\left[i\tfrac{\pi}{4}(Z_1 + Z_0 - Z_1 Z_0)\right]. \tag{H.4}$$

The point of writing $C^Z$ in this clumsy way is that the unitary
transformations one can construct physically are those of the form

$$U = \exp(i\mathcal{H}t), \tag{H.5}$$

where $\hbar\mathcal{H}$ is the Hamiltonian that describes the external
fields acting on the Qbits and the interactions between Qbits. So to
within an overall constant phase factor we can realize a $C^Z$ gate by
letting the two Qbits interact through a Hamiltonian proportional to
$Z_1 + Z_0 - Z_1 Z_0$ for a precisely specified interval of time. If each
Qbit is a spin-$\tfrac{1}{2}$, then (since $Z = \sigma_z$) this
Hamiltonian describes two such spins with a highly anisotropic
interaction that couples only their $z$-components (Ising interaction)
subject to a uniform magnetic field with a magnitude appropriately
proportional to the strength of their coupling. This is perhaps
the simplest example of how to make a cNOT gate.

Ising interactions, however, are rather hard to arrange. A much more
natural interaction between two spins is the exchange interaction

$$\boldsymbol{\sigma}^{(1)} \cdot \boldsymbol{\sigma}^{(0)} = \sigma_x^{(1)}\sigma_x^{(0)} + \sigma_y^{(1)}\sigma_y^{(0)} + \sigma_z^{(1)}\sigma_z^{(0)}, \tag{H.6}$$

which is invariant under spatial rotations, as described in Appendix B.

One can also build a $C^Z$ gate out of two spins interacting through
(H.6), if one applies to each spin magnetic fields that are along the same
direction but have different magnitudes and signs.¹

What we must show is that to within an overall constant phase factor
it is possible to express $C^Z$ in the form

$$C^Z = \exp(i\mathcal{H}t), \tag{H.7}$$

with a Hamiltonian $\mathcal{H}$ of the form

$$\mathcal{H} = J\,\boldsymbol{\sigma}^{(1)} \cdot \boldsymbol{\sigma}^{(0)} + B_1 \sigma_z^{(1)} + B_0 \sigma_z^{(0)} \tag{H.8}$$

for appropriate choices of $J$ (known as the exchange coupling), of $B_1$
and $B_0$ (proportional to the magnetic fields acting on the two spins -
hereafter we ignore the proportionality constant and refer to them
simply as the "magnetic fields"), and of the time $t$ during which the
spins interact with each other and with the magnetic fields.

To see that the parameters in (H.8) can indeed be chosen so that
$\mathcal{H}$ gives rise to $C^Z$ through (H.7), recall first that the
operator $\tfrac{1}{2}(1 + \boldsymbol{\sigma}^{(1)} \cdot \boldsymbol{\sigma}^{(0)})$
acts as the swap operator on any 2-Qbit computational-basis state:²

$$\tfrac{1}{2}\left(1 + \boldsymbol{\sigma}^{(1)} \cdot \boldsymbol{\sigma}^{(0)}\right)|xy\rangle = |yx\rangle. \tag{H.9}$$

It follows from (H.9) that the three states (called triplet states)

$$|11\rangle, \quad |00\rangle, \quad \tfrac{1}{\sqrt{2}}(|01\rangle + |10\rangle) \tag{H.10}$$

are eigenstates of $\boldsymbol{\sigma}^{(1)} \cdot \boldsymbol{\sigma}^{(0)}$
with eigenvalue 1, while the state

$$\tfrac{1}{\sqrt{2}}(|01\rangle - |10\rangle) \tag{H.11}$$

(called the singlet state) is an eigenstate of
$\boldsymbol{\sigma}^{(1)} \cdot \boldsymbol{\sigma}^{(0)}$ with
eigenvalue $-3$.³


1 What follows was inspired by Guido Burkard, Daniel Loss, David P.
DiVincenzo, and John A. Smolin, “Physical optimization of quantum error
correction circuits,” Physical Review B 60, 11404-11416 (1999),
http://arxiv.org/abs/cond-mat/9905230.

2 This was established in Equation (1.53). It is why the interaction is called the
exchange interaction.


The four states (H.10) and (H.11) are also eigenstates of
$\tfrac{1}{2}(\sigma_z^{(1)} + \sigma_z^{(0)})$, the three triplet states
(in the order in which they appear in (H.10)) having eigenvalues $-1$,
1, and 0, and the singlet state having eigenvalue 0.

Note also that the first two triplet states in (H.10) are eigenstates of
$\tfrac{1}{2}(\sigma_z^{(1)} - \sigma_z^{(0)})$ with eigenvalue 0, while
$\tfrac{1}{2}(\sigma_z^{(1)} - \sigma_z^{(0)})$ takes the third of
the triplet states into the singlet state, and vice versa.

So the eigenstates of the Hamiltonian

$$\mathcal{H} = J\,\boldsymbol{\sigma}^{(1)} \cdot \boldsymbol{\sigma}^{(0)} + B_1 \sigma_z^{(1)} + B_0 \sigma_z^{(0)} = J\,\boldsymbol{\sigma}^{(1)} \cdot \boldsymbol{\sigma}^{(0)} + B_+ \tfrac{1}{2}\left(\sigma_z^{(1)} + \sigma_z^{(0)}\right) + B_- \tfrac{1}{2}\left(\sigma_z^{(1)} - \sigma_z^{(0)}\right), \tag{H.12}$$

where

$$B_\pm = B_1 \pm B_0, \tag{H.13}$$

can be taken to be the first two of the triplet states (H.10) and two
appropriately chosen orthogonal linear combinations of the third triplet
state and the singlet state (H.11). The eigenvalues of $\mathcal{H}$
associated with the first and second triplet states are $J - B_+$ and
$J + B_+$; those associated with the last two states are the eigenvalues
of the matrix

$$\begin{pmatrix} J & B_- \\ B_- & -3J \end{pmatrix}$$

of $\mathcal{H}$ in the space spanned by the last two; i.e.
$-J \pm \sqrt{4J^2 + B_-^2}$.

Now the four states (H.10) and (H.11) are also eigenstates of $C^Z$,
the first of the three triplet states having eigenvalue $-1$ and the other
three having eigenvalue 1. Consequently these eigenstates of
$\mathcal{H}$ are also eigenstates of $C^Z$ with respective eigenvalues
$-1$, 1, 1, and 1. We will therefore produce $C^Z$ (to within a constant
phase factor) if we can choose the exchange coupling $J$, the magnetic
fields $B_1$ and $B_0$, and the time $t$ during which $\mathcal{H}$ acts
to satisfy

$$-e^{it(J - B_+)} = e^{it(J + B_+)} = e^{it\left(-J + \sqrt{4J^2 + B_-^2}\right)} = e^{it\left(-J - \sqrt{4J^2 + B_-^2}\right)}. \tag{H.14}$$

The last equality is equivalent to

$$e^{2it\sqrt{4J^2 + B_-^2}} = 1, \quad\text{or}\quad e^{it\sqrt{4J^2 + B_-^2}} = \pm 1; \tag{H.15}$$

the first is equivalent to

$$e^{2itB_+} = -1, \quad\text{or}\quad e^{itB_+} = \pm i; \tag{H.16}$$

and the second is equivalent to

$$e^{-2itJ} = e^{itB_+} e^{-it\sqrt{4J^2 + B_-^2}}. \tag{H.17}$$

The identities (H.15) and (H.16) require the right side of (H.17) to
be $\pm i$. For the (positive) time $t$ for which the gate acts to be as
small as possible we should choose $-i$, which gives

$$Jt = \pi/4. \tag{H.18}$$

With this value of $t$ we can satisfy (H.15) (with the minus sign) and
(H.16) (with the plus sign) by taking $\sqrt{4J^2 + B_-^2} = 4J$ and
$B_+ = 2J$.


3 If $|0\rangle$ is the state $|\uparrow\rangle$ of spin-up along $z$, and
$|1\rangle$ is $|\downarrow\rangle$, then the singlet state is the state of
zero total angular momentum and the three triplet states are the states of
angular momentum 1 with $z$-components $-\hbar$, 0, and $\hbar$.


So we can produce the gate $C^Z$ (to within an overall constant phase
factor) by taking the magnetic fields in the Hamiltonian (H.12) and the
time for which it acts to be related to the exchange coupling by

$$B_+ = 2J, \quad B_- = 2\sqrt{3}\,J, \quad t = \tfrac{1}{4}\pi/J, \tag{H.19}$$

or, in terms of the fields on each spin,

$$B_1 = (1 + \sqrt{3})J, \quad B_0 = (1 - \sqrt{3})J, \quad t = \tfrac{1}{4}\pi/J. \tag{H.20}$$

Note the curious fact that although, as (H.2) makes explicit, the
gate $C^Z$ acts symmetrically on the two spins, the realization of $C^Z$
by the unitary transformation $e^{i\mathcal{H}t}$ requires the fields
acting on the spins to break that symmetry. Of course the symmetry
survives in the fact that the alternative choice of fields
$B_1 = (1 - \sqrt{3})J$, $B_0 = (1 + \sqrt{3})J$ works just as well.
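The result (H.20) can be checked numerically. The sketch below is an illustration only: it builds the Hamiltonian (H.12) as a 4 x 4 matrix in the basis $|x_1 x_0\rangle$, exponentiates it with a plain Taylor series (an assumed, deliberately simple choice that is adequate for a matrix this small), and compares the result with $C^Z$ up to an overall phase. All function names are hypothetical.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def kron(a, b):
    """Kronecker product; the left factor acts on Qbit 1, the right on Qbit 0."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(*ms):
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]


def scale(c, m):
    return [[c * x for x in row] for row in m]


def expm(m, terms=80):
    """Matrix exponential by Taylor series (fine for small 4 x 4 matrices)."""
    n = len(m)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1 / k, matmul(term, m))
        result = add(result, term)
    return result


def exchange_gate(J=1.0):
    """exp(i H t) for the Hamiltonian (H.12) with the fields and time of (H.20)."""
    b1 = (1 + math.sqrt(3)) * J
    b0 = (1 - math.sqrt(3)) * J
    t = math.pi / (4 * J)
    sigma_dot = add(kron(X, X), kron(Y, Y), kron(Z, Z))
    h = add(scale(J, sigma_dot), scale(b1, kron(Z, I2)), scale(b0, kron(I2, Z)))
    return expm(scale(1j * t, h))


def deviation_from_cz(u):
    """Overall phase of u and its largest entrywise distance from phase * CZ."""
    phase = u[0][0]
    err = max(abs(u[i][j] - phase * CZ[i][j]) for i in range(4) for j in range(4))
    return phase, err


print(deviation_from_cz(exchange_gate()))
```

The deviation should vanish to rounding error, with overall phase $e^{3i\pi/4}$, which follows from the eigenvalue $J + B_+ = 3J$ of $|00\rangle$ and $Jt = \pi/4$.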





Appendix I 

A little elementary group theory 


A set of positive integers less than $N$ constitutes a group under
multiplication modulo $N$ if the set (a) contains 1, (b) contains the
modulo-$N$ inverse of any of its members, and (c) contains the
modulo-$N$ products of all pairs of its members. A subset of a group
meeting conditions (a)-(c) is called a subgroup. The number of members
of a group is called the order of the group. An important result of the
elementary theory of finite groups (Lagrange's theorem) is that the
order of any of its subgroups is a divisor of the order of the group
itself. This is established in the next three paragraphs.

If $S$ is any subset of a group $G$ (not necessarily a subgroup) and $a$
is any member of $G$ (which might or might not be in $S$), define $aS$
(called a coset of $S$) to be the set of all members of $G$ of the form
$g = as$, where $s$ is any member of $S$. (Throughout this appendix
equality will be taken to mean equality modulo $N$.) Distinct members of
$S$ give rise to distinct members of $aS$, for if $s$ and $s'$ are in $S$
and $as = as'$, then multiplying both sides by the inverse of $a$ gives
$s = s'$. So any coset $aS$ has the same number of members as $S$ itself.

If the subset $S$ is a subgroup of $G$ and $s$ is a member of $S$, then
every member of the coset $sS$ must be in $S$. Since $sS$ has as many
distinct members as $S$ has, $sS = S$. If two cosets $aS$ and $bS$ of a
subgroup $S$ have a common member then there are members $s$ and $s'$ of
$S$ that satisfy $as = bs'$, so $(as)S = (bs')S$. But
$(as)S = a(sS) = aS$, and similarly $(bs')S = bS$. Therefore $aS = bS$:
two cosets of a subgroup are either identical or have no members in
common.

If $S$ is a subgroup and $a$ is a member of $G$, then since 1 is in $S$,
$a$ is in the coset $aS$. Since every member of $G$ is thus in some coset,
and since the cosets of a subgroup are either identical or disjoint, it
follows that the distinct cosets of a subgroup $S$ partition the whole
group $G$ into disjoint subsets, each of which has the same number of
members as $S$ does. Consequently the total number of members of $G$
must be an integral multiple of the number of members of any of its
subgroups $S$: the order of any subgroup $S$ is a divisor of the order
of the whole group $G$.

Of particular interest is the subgroup given by all the distinct powers
of any particular member $a$ of $G$. Since $G$ is a finite set, the set of
distinct powers of $a$ is also finite, and therefore for some $n$ and $m$
with $n > m$ we must have $a^n = a^m$, or $a^{n-m} = 1$. The order of $a$
is defined to be the smallest nonzero $k$ with $a^k = 1$. The subset
$a, a^2, \ldots, a^k$ of $G$ is a subgroup of $G$, since it contains
$1 = a^k$ and the inverses and products of all its members. It is called
the subgroup generated by $a$, and its order is the order $k$ of $a$.
Since the order of any subgroup of $G$ divides the order of $G$, we
conclude that the order of any member of $G$ divides the order of $G$.
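These statements are easy to check on a small case. The sketch below is illustrative only: it assumes the group is the set of positive integers less than $N$ that are coprime to $N$ (which satisfies conditions (a)-(c)), and the helper names `group_mod`, `order`, and `cosets` are hypothetical.

```python
from math import gcd


def group_mod(N):
    """Positive integers less than N with no factors in common with N."""
    return [a for a in range(1, N) if gcd(a, N) == 1]


def order(a, N):
    """Smallest k >= 1 with a**k = 1 (mod N); a must be coprime to N."""
    k, power = 1, a % N
    while power != 1:
        power = power * a % N
        k += 1
    return k


def cosets(subgroup, group, N):
    """The distinct cosets aS of a subgroup S, as a set of frozensets."""
    return {frozenset(a * s % N for s in subgroup) for a in group}


N = 15
G = group_mod(N)
S = [pow(2, k, N) for k in range(1, order(2, N) + 1)]  # subgroup generated by 2
print(len(G), sorted(S), [sorted(c) for c in cosets(S, G, N)])
```

For $N = 15$ the group has order 8, the subgroup generated by 2 is $\{1, 2, 4, 8\}$, and its two cosets of four members each partition the group.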



Appendix J 

Some simple number theory 


J.1 The Euclidean algorithm 

We wish to find the greatest common divisor of two numbers $f$ and $c$,
with $f > c$. The Euclidean algorithm is the iterative procedure that
replaces $f$ and $c$ by $f' = c$ and $c' = f - [f/c]\,c$, where $[x]$ is
the largest integer less than or equal to $x$. Evidently any factors
common to $f$ and $c$ are also common to $f'$ and $c'$ and vice versa.
Furthermore, $f'$ and $c'$ decrease with each iteration and each
iteration keeps $f' > c'$, until the procedure reaches $c' = 0$. Let
$f_0$ and $c_0$ be the values of $f$ and $c$ at the last stage before
$c' = 0$. They have the same common factors as the original $f$ and $c$,
and $f_0$ is divisible by $c_0$, since the next stage is $c_0' = 0$.
Therefore $c_0$ is the greatest common divisor of $f$ and $c$.
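The iteration just described translates directly into a few lines of code. This is a minimal sketch with a hypothetical function name; Python's `//` plays the role of $[f/c]$.

```python
def euclid_gcd(f, c):
    """Greatest common divisor by the iteration f' = c, c' = f - [f/c] c."""
    while c != 0:
        f, c = c, f - (f // c) * c
    return f


print(euclid_gcd(16384, 11490))
```

When the loop stops, `f` holds the value called $c_0$ above, the last nonzero $c$.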


J.2 Finding inverses modulo an integer 

We can use the Euclidean algorithm to find the inverse of an integer $c$
modulo an integer $f > c$, when $f$ and $c$ have no common factors. In
this case iterating the Euclidean algorithm eventually leads to
$c_0 = 1$. This stage must have arisen from a pair $f_1$ and $c_1$
satisfying $1 = f_1 - m c_1$ for some integer $m$. But $f_1$ and $c_1$
are given by explicit integral linear combinations of the pair at the
preceding stage, $f_2$ and $c_2$, which in turn are explicit integral
linear combinations of $f_3$ and $c_3$, etc. So one can work backwards
through the iterations to construct integers $j$ and $k$ with
$1 = jf + kc$. Since $k$ cannot be a multiple of $f$, we can express
$k$ as $lf + d$ with $1 \le d < f$ and with $l$ an integer (negative, if
$k$ is negative); $d$ is then the inverse of $c$ modulo $f$.
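Rather than working backwards through the iterations, one can carry the coefficient $k$ forwards alongside the remainders. The sketch below takes that route (a standard variant of the procedure described above, not the book's own bookkeeping); the function name `inverse_mod` is hypothetical.

```python
def inverse_mod(c, f):
    """Inverse of c modulo f, assuming c and f have no common factors.

    Each remainder is kept in the form (integer)*f + k*c; only k is tracked.
    """
    old_r, r = f, c
    old_k, k = 0, 1
    while r != 0:
        m = old_r // r
        old_r, r = r, old_r - m * r
        old_k, k = k, old_k - m * k
    if old_r != 1:
        raise ValueError("c and f have a common factor")
    return old_k % f  # reduce k to d with 1 <= d < f


print(inverse_mod(3, 7))
```

Here $3 \times 5 = 15 = 1 \pmod 7$, so the inverse of 3 modulo 7 is 5.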


J.3 The probability of common factors 

The probability of two random numbers having no common factors is
greater than $\tfrac{1}{2}$, for the probability is $\tfrac{3}{4}$ that
they are not both divisible by 2, $\tfrac{8}{9}$ that they are not both
divisible by 3, $\tfrac{24}{25}$ that they are not both divisible by 5,
etc. The probability that they share no prime factors at all is

$$\prod_{\text{primes}} \left(1 - 1/p^2\right) = 1 \Big/ \prod_{\text{primes}} \left(1 + 1/p^2 + 1/p^4 + \cdots\right) = 1\big/\left(1 + 1/2^2 + 1/3^2 + 1/4^2 + 1/5^2 + 1/6^2 + \cdots\right) = 6/\pi^2 = 0.6079\ldots \tag{J.1}$$

If the numbers are confined to a finite range this argument gives only
an estimate of the probability, but it is quite a good estimate if the range
is large.
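How good the estimate is for a finite range can be seen by direct counting. The following sketch (illustrative; the function name and the choice of range are assumptions) counts the pairs of integers between 1 and a given limit that share no factor.

```python
from math import gcd, pi


def coprime_fraction(limit):
    """Fraction of pairs (a, b) with 1 <= a, b <= limit having no common factor."""
    count = sum(1 for a in range(1, limit + 1)
                for b in range(1, limit + 1) if gcd(a, b) == 1)
    return count / limit ** 2


print(coprime_fraction(400), 6 / pi ** 2)
```

Already for a range of a few hundred the counted fraction agrees with $6/\pi^2$ to about two decimal places.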



Appendix K 

Period finding and continued 
fractions 


We illustrate here the mathematics of the final
(post-quantum-computational) stage of Shor's period-finding procedure.
The final measurement produces (with high probability) an integer $y$
that is within $\tfrac{1}{2}$ of an integral multiple of $2^n/r$, where
$n$ is the number of Qbits in the input register, satisfying
$2^n > N^2 > r^2$. Deducing the period $r$ of the function $f$ from such
an integer $y$ makes use of the theorem that if $x$ is an estimate for
the fraction $j/r$ that differs from it by less than $1/2r^2$, then $j/r$
will appear as one of the partial sums in the continued-fraction
expansion of $x$.¹ In the case of Shor's period-finding algorithm
$x = y/2^n$. If $j$ and $r$ happen to have no factors in common, $r$ is
given by the denominator of the partial sum with the largest denominator
less than $N$. Otherwise the continued-fraction expansion of $x$ gives
$r_0$: $r$ divided by whatever factor it has in common with the random
integer $j$. If several small multiples of $r_0$ fail to be a period of
$f$, one repeats the whole procedure, getting a different submultiple
$r_1$ of $r$. There is a good chance that $r$ will be the least common
multiple of $r_0$ and $r_1$, or a not terribly large multiple of it. If
not, one repeats the whole procedure a few more times until one succeeds
in finding a period of $f$. We illustrate this with two examples.

Example 1. (Successful the first time.) Suppose we know that the
period $r$ is less than $2^7 = 128$ and that $y = 11\,490$ is within
$\tfrac{1}{2}$ of an integral multiple of $2^{14}/r$. What is $r$?

Example 2. (Two attempts required.) Suppose we know that the integer
$r$ is less than $2^7$ and that 11 343 and 13 653 are both within
$\tfrac{1}{2}$ of integral multiples of $2^{14}/r$. What is $r$?

In either example the fraction $j/r$ for some (random) integer $j$ will
necessarily be one of the partial sums (defined below) of the
continued-fraction expansion of $y/2^{14}$, where $y$ is one of the cited
five-digit integers. The partial sum with the largest denominator less
than 128 is the one we are looking for. Once we have found the answer we
can easily check that it is correct.


1 Theorem 184, page 153, G. H. Hardy and E. M. Wright, An Introduction to
the Theory of Numbers, 4th edition, Oxford University Press (1965).

The continued-fraction expansion of a real number $x$ between 0 and
1 is

$$x = \cfrac{1}{a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}} \tag{K.1}$$

with positive integers $a_0, a_1, a_2, \ldots$ Evidently $a_0$ is the
integral part of $1/x$. Let $x_1$ be the fractional part of $1/x$. Then
it follows from (K.1) that

$$x_1 = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}} \tag{K.2}$$

so $a_1$ is the integral part of $1/x_1$. Letting $x_2$ be the fractional
part of $1/x_1$, one can continue this iterative procedure to extract
$a_2$ as the integral part of $1/x_2$, and so on.

By the partial sums of the continued fraction (K.1), one means

$$\frac{1}{a_0}, \quad \cfrac{1}{a_0 + \cfrac{1}{a_1}}, \quad \cfrac{1}{a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2}}}, \quad \text{etc.} \tag{K.3}$$


One can deal with both examples using an (unprogrammed) pocket
calculator. One starts with $1/x = 2^{14}/y$ in the display and subtracts
the integral part $a_0$, noting it down. One then inverts what remains,
to get $1/x_1$, and repeats the process until one has accumulated a long
enough list of $a_j$.
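The pocket-calculator procedure can also be carried out in exact integer arithmetic. The sketch below is illustrative, with hypothetical function names: it extracts the $a_j$ for a rational $x$ between 0 and 1 and forms the partial sums (K.3) as exact fractions.

```python
from fractions import Fraction


def cf_terms(num, den):
    """The integers a_0, a_1, ... of (K.1) for x = num/den with 0 < x < 1."""
    terms = []
    while num != 0:
        a, rem = divmod(den, num)  # integral part of 1/x, and what remains
        terms.append(a)
        num, den = rem, num        # next x is the fractional part rem/num
    return terms


def partial_sums(terms):
    """The partial sums (K.3) built from a list of a_j."""
    sums = []
    for k in range(1, len(terms) + 1):
        value = Fraction(0)
        for a in reversed(terms[:k]):
            value = 1 / (a + value)
        sums.append(value)
    return sums


terms = cf_terms(11490, 2 ** 14)
print(terms)
print(partial_sums(terms))
```

For a rational $x$ the expansion terminates, and the last partial sum reproduces $x$ itself in lowest terms.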


Analysis of example 1. We know that $r < 128$ and that
$x = 11\,490/2^{14}$ is within $\tfrac{1}{2} 2^{-14}$ of $j/r$ for
integers $j$ and $r$. Playing with a calculator tells us that

$$11\,490/2^{14} = 0.701\,293\,945\,3\ldots = \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{7 + \cfrac{1}{35 + \cdots}}}}}} \tag{K.4}$$

If we drop what comes after the 35 and start forming partial sums
we quickly get to a denominator bigger than 128. If we also drop the
$\tfrac{1}{35}$ we find that

$$11\,490/2^{14} \approx \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{7}}}}} \tag{K.5}$$

which works out to $\tfrac{54}{77}$.² Since 77 is the only multiple of 77
less than 128, $r = 77$. And indeed,

$$2^{14} \times \tfrac{54}{77} = 11\,490.078\ldots$$

which is within $\tfrac{1}{2}$ of 11 490.

Analysis of example 2. We know that the integer $r$ is less than 128
and that $x = 11\,343/2^{14}$ and $x' = 13\,653/2^{14}$ are both within
$\tfrac{1}{2} 2^{-14}$ of integral multiples of $1/r$. The calculator
tells us that

$$11\,343/2^{14} = \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{419 + \cdots}}}}} \tag{K.6}$$

Since 419 is bigger than 128 we can drop the $\tfrac{1}{419}$ to get

$$\cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{1}}}} \tag{K.7}$$

which gives $\tfrac{9}{13}$, and indeed

$$2^{14} \times \tfrac{9}{13} = 11\,342.769\ldots, \tag{K.8}$$

which is within $\tfrac{1}{2}$ of 11 343. The number $r$ is thus a
multiple of 13 less than 128, of which there are nine. Had we the
function $f$ at hand (which we do in the case of interest) we could try
all nine to determine the period, but to illustrate what one can do when
there are too many possibilities to try them all, we take advantage of
the second piece of information, which could have been produced by
running Shor's algorithm a second time.


2 A more systematic way to get this is to use the famous but not transparently
obvious recursion relation for the numerators $p$ and denominators $q$ of the
partial sums: $p_n = a_n p_{n-1} + p_{n-2}$, and $q_n = a_n q_{n-1} + q_{n-2}$,
with $q_0 = a_0$, $q_1 = 1 + a_0 a_1$ and $p_0 = 1$, $p_1 = a_1$. One easily
applies these to the sequence $a_0, a_1, a_2, \ldots = 1, 2, 2, 1, 7, 35, \ldots$,
stopping when one gets to a denominator larger than 128.


We also have

$$13\,653/2^{14} = \cfrac{1}{1 + \cfrac{1}{4 + \cfrac{1}{1 + \cfrac{1}{1364 + \cdots}}}} \tag{K.9}$$

Since 1364 is bigger than 128 we can drop the $\tfrac{1}{1364}$ to get

$$\cfrac{1}{1 + \cfrac{1}{4 + \cfrac{1}{1}}} \tag{K.10}$$

which gives $\tfrac{5}{6}$, and indeed

$$2^{14} \times \tfrac{5}{6} = 13\,653.333\ldots, \tag{K.11}$$

which is within $\tfrac{1}{2}$ of 13 653. So $r$ is also a multiple of 6
less than 128. Since 6 and 13 have no common factors the least multiple
of both is $6 \times 13 = 78$. Since there is no multiple of 78 less than
128 other than 78 itself, $r = 78$.
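Both examples can be reproduced with the recursion relation for the numerators and denominators of the partial sums. The sketch below is illustrative only; the function name `best_partial_sum` and the helper `lcm` are hypothetical, and the bound 128 is the one assumed in the examples.

```python
from fractions import Fraction
from math import gcd


def best_partial_sum(y, n_bits, bound):
    """Partial sum of y / 2**n_bits with the largest denominator below bound.

    Uses p_k = a_k p_{k-1} + p_{k-2} and q_k = a_k q_{k-1} + q_{k-2}.
    """
    num, den = y, 2 ** n_bits
    p_prev, q_prev, p, q = 1, 0, 0, 1
    while num != 0:
        a, rem = divmod(den, num)
        p_next, q_next = a * p + p_prev, a * q + q_prev
        if q_next >= bound:
            break
        p_prev, q_prev, p, q = p, q, p_next, q_next
        num, den = rem, num
    return Fraction(p, q)


def lcm(a, b):
    return a * b // gcd(a, b)


# Example 1: a single run suffices.
print(best_partial_sum(11490, 14, 128))

# Example 2: two runs give two submultiples of r.
r0 = best_partial_sum(11343, 14, 128).denominator
r1 = best_partial_sum(13653, 14, 128).denominator
print(r0, r1, lcm(r0, r1))
```

The first call gives $54/77$, and the two runs of the second example give denominators 13 and 6, whose least common multiple is 78.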









Appendix L 

Better estimates of success in 
period finding 


In Section 3.7 it is shown that with a probability of at least 0.4, a single
application of Shor's period-finding procedure produces an integer $y$
that is within $\tfrac{1}{2}$ of an integral multiple of $2^n/r$, where
$r$ is the period sought. Since $2^n > N^2 > r^2$, $y/2^n$ is within
$1/(2r^2)$ of $j/r$ for some integer $j$, and therefore, by the theorem
cited in Appendix K, $j/r$ and hence a divisor of $r$ ($r$ divided by any
factors it may have in common with $j$) can be found from the
continued-fraction expansion of $y/2^n$.

What is crucial for learning a divisor of $r$ is that the estimate for
$j/r$ emerging from Shor's procedure be within $1/2r^2$ of a multiple of
$1/r$. Now when $N$ is the product of two odd primes $p$ and $q$, as it
is in the case of RSA encryption, then the required period $r$ is not
only less than $N$, but also less than $\tfrac{1}{2}N$. This is because
$\tfrac{1}{2}(q - 1)$ is an integer, so it follows from Fermat's little
theorem,

$$b^{p-1} \equiv 1 \pmod p, \tag{L.1}$$

that

$$b^{(p-1)(q-1)/2} \equiv 1 \pmod p. \tag{L.2}$$

For the same reason it follows from

$$b^{q-1} \equiv 1 \pmod q \tag{L.3}$$

that

$$b^{(p-1)(q-1)/2} \equiv 1 \pmod q. \tag{L.4}$$

But since $p$ and $q$ are prime, the fact that $b^{(p-1)(q-1)/2} - 1$ is
divisible by both $p$ and $q$ means that it must be divisible by the
product $pq$, and therefore

$$b^{(p-1)(q-1)/2} \equiv 1 \pmod{pq}. \tag{L.5}$$

So if

$$b^r \equiv 1 \pmod{pq} \tag{L.6}$$

and $r$ exceeded $\tfrac{1}{2}N$, then we would also have

$$b^{r - (p-1)(q-1)/2} \equiv 1 \pmod{pq}, \tag{L.7}$$

and since $r - \tfrac{1}{2}(p-1)(q-1) > r - \tfrac{1}{2}N > 0$, (L.7)
would give a positive power of $b$ smaller than $r$ that was congruent to
1 modulo $pq$, so $r$ could not be the period (which is the least such
power).
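The bound is easy to confirm for small cases. In the sketch below (illustrative, with hypothetical function names) every $b$ with no factor in common with $N = pq$ is checked against (L.5), and the largest period that occurs is compared with $\tfrac{1}{2}N$.

```python
from math import gcd


def order(b, N):
    """Smallest r >= 1 with b**r = 1 (mod N); b must be coprime to N."""
    r, power = 1, b % N
    while power != 1:
        power = power * b % N
        r += 1
    return r


def check_half_bound(p, q):
    """Return the largest period modulo pq, after verifying (L.5) for every b."""
    N = p * q
    half = (p - 1) * (q - 1) // 2
    units = [b for b in range(1, N) if gcd(b, N) == 1]
    assert all(pow(b, half, N) == 1 for b in units)  # equation (L.5)
    return max(order(b, N) for b in units)


for p, q in [(3, 5), (7, 11), (5, 13)]:
    print(p * q, check_half_bound(p, q))
```

For $N = 77$, for example, no period exceeds $\tfrac{1}{2}(7-1)(11-1) = 30$, comfortably below $\tfrac{1}{2}N$.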

It follows that even if $y$ is not the closest integer to an integral
multiple of $2^n/r$, if it is within 2 of such an integral multiple, then

$$|y/2^n - j/r| < 2/N^2 < 1/2r^2. \tag{L.8}$$

So for each $j/r$ the algorithm will succeed in providing a divisor of $r$
not only if the measured $y$ is the closest integer to $2^n j/r$, but also
if it is the second, third, or fourth closest. Gerjuoy has estimated that
this increases the probability of a successful run to about 0.9.¹

Bourdon and Williams have refined this to 0.95 for large $N$ and $r$.²
They also show that if one modifies the hardware, adding a few more
Qbits to the input register so that $n > 2n_0 + q$, then for rather small
$q$ the probability of finding a divisor of $r$ from the output of a single
run of the quantum computation can be made quite close to 1.


1 Edward Gerjuoy, “Shor’s factoring algorithm and modern cryptography. An 
illustration of the capabilities inherent in quantum computers,” American 
Journal of Physics 73, 521-540 (2005), 

http://arxiv.org/abs/quant-ph/0411184. 

2 P. S. Bourdon and H. T. Williams, “Sharp probability estimates for Shor’s 
order-finding algorithm,” 

http://arxiv.org/abs/quant-ph/0607148. 




Appendix M 

Factoring and period finding 


We establish here the only hard part of the connection between factoring
and period finding: that the probability is at least $\tfrac{1}{2}$ that
if $a$ is a random member of $G_{pq}$ for prime $p$ and $q$, then the
order $r$ of $a$ in $G_{pq}$ satisfies both

$$r \text{ even} \tag{M.1}$$

and

$$a^{r/2} \not\equiv -1 \pmod{pq}. \tag{M.2}$$

(In Section 3.10 it is shown that given such an $a$ and its order $r$, the
problem of factoring $N = pq$ is easily solved.)

Note first that the order $r$ of $a$ in $G_{pq}$ is the least common
multiple of the orders $r_p$ and $r_q$ of $a$ in $G_p$ and in $G_q$. That
$r$ must be some multiple of both $r_p$ and $r_q$ is immediate, since
$a^r \equiv 1 \pmod{pq}$ implies that $a^r \equiv 1 \pmod p$ and
$a^r \equiv 1 \pmod q$. Furthermore, any common multiple $r'$ of $r_p$ and
$r_q$ satisfies $a^{r'} \equiv 1 \pmod{pq}$, because if
$a^{r'} = 1 + mp$ and $a^{r'} = 1 + nq$, then $mp = nq$. But since the
primes $p$ and $q$ have no common factors this requires $m = kq$ and
$n = kp$, and hence $a^{r'} = 1 + kpq \equiv 1 \pmod{pq}$. Since $r$ is
the least integer with $a^r \equiv 1 \pmod{pq}$, $r$ must be the least
common multiple of $r_p$ and $r_q$.

Consequently condition (M.1) can fail only if $r_p$ and $r_q$ are both
odd. Condition (M.2) can fail only if $r_p$ and $r_q$ are both odd
multiples of the same power of 2. For if $r_p$ contains a higher power of
2 than $r_q$, then since $r$ is a common multiple of $r_p$ and $r_q$, it
will remain a multiple of $r_q$ if a single factor of 2 is removed from
it, and therefore $a^{r/2} \equiv 1 \pmod q$. But this is inconsistent
with a failure of condition (M.2), which would imply that
$a^{r/2} \equiv -1 \pmod q$.

So a necessary condition for failure to factor $N = pq$ is that $r_p$ and
$r_q$ are either both odd, or both odd multiples of the same power of 2.
The first condition is absorbed into the second if we agree that the
powers of 2 include $2^0 = 1$. Our effort to factor $N$ can fail only if
we have picked a random $a$ for which $r_p$ and $r_q$ are both odd
multiples of the same power of 2.

To calculate an upper bound for the probability of failure $p_f$, note
first that the modulo-$p$ and modulo-$q$ orders, $r_p$ and $r_q$, of $a$
are the same as the mod-$p$ and mod-$q$ orders of the numbers $a_p$ and
$a_q$ in $G_p$ and $G_q$, where

$$a \equiv a_p \pmod p, \quad a \equiv a_q \pmod q. \tag{M.3}$$

Furthermore, every number $a$ in $G_{pq}$ is associated through (M.3) with
a unique pair from $G_p$ and $G_q$. For if $a_p = b_p$ and $a_q = b_q$
then $a - b$ is a multiple of both $p$ and $q$, and therefore, since $p$
and $q$ are distinct primes, $a - b$ is a multiple of $pq$ itself, so
$a \equiv b \pmod{pq}$.

Since the $(p-1)(q-1)$ different members of $G_{pq}$ are thus in
one-to-one correspondence with the distinct pairs, one from the $p - 1$
members of $G_p$ and one from the $q - 1$ members of $G_q$, the
modulo-$p$ and modulo-$q$ orders $r_p$ and $r_q$ of a random integer $a$
in $G_{pq}$ will have exactly the same statistical distribution as the
orders $r_p$ and $r_q$ of randomly and independently selected integers in
$G_p$ and $G_q$. So to show that the probability of failure is at most
$\tfrac{1}{2}$, we must show that the probability is at most
$\tfrac{1}{2}$ that the orders $r_p$ and $r_q$ of such a randomly and
independently selected pair are both odd multiples of the same power of 2.

We do this by showing that for any prime $p$, no more than half
of the numbers in $G_p$ can have orders $r_p$ that are odd multiples of
any given power of 2. (Given this, if $P_p(j)$ and $P_q(j)$ are the
probabilities that random elements of $G_p$ and $G_q$ have orders that
are odd multiples of $2^j$, then the probability of failure $p_f$ is at
most $\sum_{j \ge 0} P_p(j) P_q(j) \le \tfrac{1}{2} \sum_{j \ge 0} P_q(j) = \tfrac{1}{2}$.)
This follows from the fact that if the order $p - 1$ of $G_p$ is an odd
multiple of $2^k$ for some $k > 0$, then exactly half the elements of
$G_p$ have orders that are odd multiples of $2^k$. This in turn follows
from the theorem that if $p$ is a prime, then $G_p$ has at least one
primitive element $b$ of order $p - 1$, whose successive powers therefore
generate the entire group. Given this theorem - which is proved at the
end of this appendix - we complete the argument by showing that the
orders of the odd powers of any such primitive $b$ are odd multiples of
$2^k$, but the orders of the even powers are not.

If $r_0$ is the order of $b^j$ with $j$ odd, then

$$1 \equiv (b^j)^{r_0} = b^{j r_0} \pmod p, \tag{M.4}$$

so $j r_0$ must be a multiple of $p - 1$, the order of $b$. Since $j$ is
odd $r_0$ must contain at least as many powers of 2 as does $p - 1$. But
since the order $r_0$ of any element must divide the order $p - 1$ of the
group, $r_0$ cannot contain more powers of 2 than $p - 1$ does. So $r_0$
is an odd multiple of $2^k$. On the other hand if $j$ is even, then $b^j$
satisfies

$$(b^j)^{(p-1)/2} = (b^{p-1})^{j/2} \equiv 1 \pmod p, \tag{M.5}$$

so the order $r_0$ of $b^j$ divides $(p-1)/2$. Therefore $p - 1$ contains
at least one more power of 2 than does $r_0$.

This concludes the proof that the probability is at least $\tfrac{1}{2}$
that a random choice of $a$ in $G_{pq}$ will satisfy both of the
conditions (M.1) and (M.2) that lead, with the aid of an efficient
period-finding routine, to an easy factorization of $N = pq$, as
described in Section 3.10.
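For small $N = pq$ the fraction of favourable $a$ can simply be counted. The sketch below is an illustration with hypothetical function names; it tests conditions (M.1) and (M.2) for every member of $G_{pq}$.

```python
from math import gcd


def order(a, N):
    """Smallest r >= 1 with a**r = 1 (mod N); a must be coprime to N."""
    r, power = 1, a % N
    while power != 1:
        power = power * a % N
        r += 1
    return r


def success_fraction(p, q):
    """Fraction of a in G_pq with r even and a**(r/2) != -1 (mod pq)."""
    N = p * q
    group = [a for a in range(1, N) if gcd(a, N) == 1]
    good = 0
    for a in group:
        r = order(a, N)
        if r % 2 == 0 and pow(a, r // 2, N) != N - 1:
            good += 1
    return good / len(group)


for p, q in [(3, 5), (3, 7), (5, 7), (7, 11)]:
    print(p * q, success_fraction(p, q))
```

For $N = 15$ six of the eight members of $G_{15}$ qualify (only $a = 1$ and $a = 14$ fail), well above the guaranteed one half.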

What remains is to prove that when $p$ is prime, $G_p$ contains at
least one number of order $p - 1$. The relevant property of the
multiplicative group of integers $\{1, 2, 3, \ldots, p - 1\}$ modulo a
prime is that together with 0 these integers also constitute a group
under addition. This provides all the structure necessary to ensure that
a polynomial of degree $d$ has at most $d$ roots.¹ We can exploit this
fact as follows.

Write the order $s = p - 1$ of $G_p$ in terms of its prime factors $q_i$:

$$s = p - 1 = q_1^{n_1} \cdots q_m^{n_m}. \tag{M.6}$$

For each $q_i$ the equation $x^{s/q_i} - 1 = 0$ has at most $s/q_i$
solutions, and since $s/q_i < s$, the number of elements in $G_p$, there
must be elements $a_i$ in $G_p$ satisfying

$$a_i^{s/q_i} \not\equiv 1 \pmod p. \tag{M.7}$$

Given such an $a_i$, define

$$b_i = a_i^{s/(q_i^{n_i})}. \tag{M.8}$$

We next show that the order of $b_i$ is $q_i^{n_i}$. This is because

$$b_i^{(q_i^{n_i})} = a_i^s \equiv 1 \pmod p, \tag{M.9}$$

so the order of $b_i$ must divide $q_i^{n_i}$ and therefore be a power of
$q_i$, since $q_i$ is prime. But if that order were any power of $q_i$
less than the $n_i$th, then we would have
$a_i^{s/q_i^k} \equiv 1 \pmod p$ with $k \ge 1$, which contradicts (M.7).

Because each $b_i$ has order $q_i^{n_i}$, the product
$b_1 b_2 \cdots b_m$ has order
$q_1^{n_1} q_2^{n_2} \cdots q_m^{n_m} = p - 1$. This follows from the fact
that if two numbers in $G_p$ have orders that are coprime, then the order
of their product is the product of their orders.² Therefore since
$q_1^{n_1}$ and $q_2^{n_2}$ are coprime, $b_1 b_2$ has order
$q_1^{n_1} q_2^{n_2}$. But since $q_1^{n_1} q_2^{n_2}$ and $q_3^{n_3}$ are
coprime, it follows that $b_1 b_2 b_3$ has order
$q_1^{n_1} q_2^{n_2} q_3^{n_3}$. Continuing in this way, we conclude
that $b_1 b_2 \cdots b_m$ has order
$q_1^{n_1} q_2^{n_2} \cdots q_m^{n_m} = s = p - 1$.


1 This is easily proved by induction on the degree of the equation, using the
fact that every nonzero integer modulo $p$ has a multiplicative inverse modulo
$p$. It is obviously true for degree 1. Suppose that it is true for degree
$m - 1$ and a polynomial $P(x)$ of degree $m$ satisfies $P(a) = 0$. Then
$P(x) = 0$ implies $P(x) - P(a) = 0$. Since $P(x) - P(a)$ has the form
$\sum_j c_j (x^j - a^j)$, the factor $x - a$ can be extracted from each term,
leading to the form $(x - a)Q(x)$, where $Q(x)$ is a polynomial of degree
$m - 1$. So if $x \ne a$ then $P(x) = 0$ requires $Q(x) = 0$, and this has at
most $m - 1$ distinct solutions by virtue of the inductive assumption.


2 Let $u$, $v$, and $w$ be the orders of $c$, $d$, and $cd$. Since
$c^u \equiv 1 \pmod p$ and $(cd)^w \equiv 1 \pmod p$, it follows that
$d^{wu} \equiv 1 \pmod p$. So the order $v$ of $d$ divides $wu$, and since
$v$ and $u$ have no common factors, $v$ divides $w$. In the same way one
concludes that $u$ divides $w$. Therefore, since $v$ and $u$ are coprime,
$w$ must be a multiple of $uv$. Furthermore,
$(cd)^{uv} = c^{uv} d^{vu} \equiv 1 \pmod p$, so $uv$ must be a multiple of
$w$. Therefore $w = uv$.
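The construction in this proof can be followed step by step for small primes. The sketch below is illustrative only, with hypothetical function names: it factors $s = p - 1$ by trial division, picks for each prime factor $q_i$ the smallest $a_i$ satisfying (M.7), forms $b_i$ as in (M.8), and multiplies the $b_i$ together. A second helper counts the elements whose order contains the same power of 2 as $p - 1$.

```python
def prime_power_factors(s):
    """Prime factorization of s as a dict {prime: exponent} (trial division)."""
    factors = {}
    d = 2
    while d * d <= s:
        while s % d == 0:
            factors[d] = factors.get(d, 0) + 1
            s //= d
        d += 1
    if s > 1:
        factors[s] = factors.get(s, 0) + 1
    return factors


def order(a, p):
    """Smallest k >= 1 with a**k = 1 (mod p), for a not divisible by p."""
    k, power = 1, a % p
    while power != 1:
        power = power * a % p
        k += 1
    return k


def primitive_element(p):
    """An element of order p - 1 in G_p, built as the product of the b_i."""
    s = p - 1
    b = 1
    for q, n in prime_power_factors(s).items():
        a = next(a for a in range(1, p) if pow(a, s // q, p) != 1)  # (M.7)
        b = b * pow(a, s // q ** n, p) % p                           # (M.8)
    return b


def two_power(n):
    """Exponent of the largest power of 2 dividing n."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


def count_full_two_power(p):
    """Number of elements of G_p whose order is an odd multiple of the
    same power of 2 that divides p - 1."""
    k = two_power(p - 1)
    return sum(1 for a in range(1, p) if two_power(order(a, p)) == k)


for p in [7, 11, 13, 17]:
    b = primitive_element(p)
    print(p, b, order(b, p), count_full_two_power(p))
```

For $p = 7$ the construction gives $b_1 = 6$ (order 2) and $b_2 = 4$ (order 3), whose product 3 has order 6; exactly three of the six elements of $G_7$ have orders that are odd multiples of 2.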




Appendix N 

Shor's 9-Qbit error-correcting code 


Shor demonstrated that quantum error correction was possible using
the two orthogonal 9-Qbit codeword states

$$|\bar{0}\rangle = 2^{-3/2}\left(|000\rangle + |111\rangle\right)\left(|000\rangle + |111\rangle\right)\left(|000\rangle + |111\rangle\right),$$

$$|\bar{1}\rangle = 2^{-3/2}\left(|000\rangle - |111\rangle\right)\left(|000\rangle - |111\rangle\right)\left(|000\rangle - |111\rangle\right). \tag{N.1}$$

These can be viewed as an extension of the simple 3-Qbit codewords we
examined in Section 5.2, making it possible to deal with 1-Qbit phase
errors, as well as bit-flip errors. An encoding circuit for the 9-Qbit
code - with an obvious resemblance to Figure 5.1 for the 3-Qbit code -
is shown in Figure N.1.

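The form (5.18) of a general 1-Qbit corruption simplifies slightly
when the state $|\Psi\rangle$ is a superposition of the codeword states (N.1), for
it follows from (N.1) that

$$
\begin{aligned}
Z_0|\Psi\rangle &= Z_1|\Psi\rangle = Z_2|\Psi\rangle,\\
Z_3|\Psi\rangle &= Z_4|\Psi\rangle = Z_5|\Psi\rangle,\\
Z_6|\Psi\rangle &= Z_7|\Psi\rangle = Z_8|\Psi\rangle.
\end{aligned}
\qquad \text{(N.2)}
$$

As a result, the general form of a 1-Qbit corruption of $|\Psi\rangle$ contains
only 22 independent terms (rather than $28 = (3 \times 9) + 1$):

$$
\Big(|d\rangle + |c\rangle Z_0 + |c'\rangle Z_3 + |c''\rangle Z_6 + \sum_{i=0}^{8}\big(|a_i\rangle X_i + |b_i\rangle Y_i\big)\Big)|\Psi\rangle.
\qquad \text{(N.3)}
$$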
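The relations (N.2), which are what reduce the count from 28 to 22, can be checked numerically from the codewords (N.1). The following is a minimal NumPy sketch under stated assumptions, not a construction taken from the text: the helper names `kron_all`, `block`, and `z_on` are hypothetical, and Qbit 0 is arbitrarily taken to be the leftmost tensor factor (the check does not depend on that choice, since (N.2) only involves Qbits within the same block of three).

```python
import numpy as np
from functools import reduce

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)

def kron_all(factors):
    """Tensor product of a list of vectors or matrices, left to right."""
    return reduce(np.kron, factors)

def block(sign):
    """(|000> + sign*|111>)/sqrt(2) for one block of three Qbits."""
    return (kron_all([ket0] * 3) + sign * kron_all([ket1] * 3)) / np.sqrt(2)

# The two 9-Qbit codewords of (N.1); the three factors 1/sqrt(2) give 2**(-3/2).
zero_bar = kron_all([block(+1)] * 3)
one_bar = kron_all([block(-1)] * 3)

def z_on(i, n=9):
    """Z acting on tensor factor i (assumed counted from the left), identity elsewhere."""
    return kron_all([Z if k == i else I2 for k in range(n)])

# An arbitrary real superposition; normalization is irrelevant for this check.
rng = np.random.default_rng(1)
alpha, beta = rng.normal(size=2)
psi = alpha * zero_bar + beta * one_bar

# (N.2): within each block of three Qbits, all three Z_i act identically on psi.
n2_holds = all(
    np.allclose(z_on(b) @ psi, z_on(b + 1) @ psi)
    and np.allclose(z_on(b) @ psi, z_on(b + 2) @ psi)
    for b in (0, 3, 6)
)
# Z errors in different blocks remain distinguishable.
blocks_differ = not np.allclose(z_on(0) @ psi, z_on(3) @ psi)

print(n2_holds, blocks_differ)
```

The dense 512 × 512 matrices are wasteful but keep the sketch close to the operator notation of the text.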

We diagnose the error syndrome with eight commuting Hermitian
operators that square to unity:

$$
\begin{gathered}
Z_0Z_1,\quad Z_1Z_2,\quad Z_3Z_4,\quad Z_4Z_5,\quad Z_6Z_7,\quad Z_7Z_8,\\
X_0X_1X_2X_3X_4X_5,\quad X_3X_4X_5X_6X_7X_8.
\end{gathered}
\qquad \text{(N.4)}
$$

All six Z-operators trivially commute with each other as do the two
X-operators, and any of the six Z-operators commutes with any of the
two X-operators because in every case the number of anticommutations
between a $Z_i$ and an $X_j$ is either zero or two.
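The counting argument can be mechanized: two products of Pauli operators commute exactly when the number of Qbits on which they carry different (hence anticommuting) Pauli factors is even. The sketch below is illustrative only; the dictionary representation and the names `pauli`, `commutes`, and `SYNDROME_OPERATORS` are assumptions introduced for the example.

```python
def pauli(letter, qbits):
    """A product of identical 1-Qbit Pauli operators, stored as {Qbit index: letter}."""
    return {i: letter for i in qbits}

def commutes(p, q):
    """True if the two Pauli products commute.

    Different Pauli letters on the same Qbit anticommute, so the products
    commute iff that happens on an even number of Qbits.
    """
    clashes = sum(1 for i in p if i in q and p[i] != q[i])
    return clashes % 2 == 0

# The eight operators of (N.4).
SYNDROME_OPERATORS = (
    [pauli("Z", pair) for pair in [(0, 1), (1, 2), (3, 4), (4, 5), (6, 7), (7, 8)]]
    + [pauli("X", range(0, 6)), pauli("X", range(3, 9))]
)

all_commute = all(
    commutes(a, b) for a in SYNDROME_OPERATORS for b in SYNDROME_OPERATORS
)
print(all_commute)
```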

Fig N.1

A circuit that transforms the 1-Qbit state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ into its
9-Qbit encoded form $|\Psi\rangle = \alpha|\bar 0\rangle + \beta|\bar 1\rangle$, where $|\bar 0\rangle$ and $|\bar 1\rangle$ are given in
(N.1). Note the relation to the simpler 3-Qbit encoding circuit in Figure 5.1.

Fig N.2

A circuit to measure the “error syndrome” for Shor’s 9-Qbit code. The nine
Qbits are the nine lower wires. The circuit is of the type illustrated in Figure
5.7, but with eight ancillary Qbits (the eight upper wires) associated with the
measurement of the eight commuting operators in (N.4), $Z_0Z_1$, $Z_1Z_2$,
$Z_3Z_4$, $Z_4Z_5$, $Z_6Z_7$, $Z_7Z_8$, $X_0X_1X_2X_3X_4X_5$, and $X_3X_4X_5X_6X_7X_8$.
Measurement of the eight ancillas projects the state of the nine lower Qbits into
the appropriate simultaneous eigenstate of those eight operators.

One easily confirms from (N.1) that $|\bar 0\rangle$, $|\bar 1\rangle$, and hence any
superposition $|\Psi\rangle$ of the two, are invariant under all eight operators in (N.4).
Each one of the 22 corrupted terms in (N.3) is also an eigenstate of
the eight operators in (N.4) with eigenvalues 1 or $-1$, because each of
the eight operators either commutes (resulting in the eigenvalue 1) or
anticommutes (resulting in the eigenvalue $-1$) with each of the $X_i$, $Y_i$,
and $Z_i$. And each of the 22 terms in (N.3) gives rise to a distinct pattern
of negative eigenvalues for the eight operators.

(a) The three errors $Z_0$, $Z_3$, and $Z_6$ are distinguished from the $X_i$ and
$Y_i$ by the fact that they commute with every one of the six Z-operators
in (N.4). These three $Z_i$ can be distinguished from each other because
$Z_0$ anticommutes with one of the two X-operators, $Z_6$ anticommutes
with the other, and $Z_3$ anticommutes with both.

(b) All nine errors $X_i$ are distinguished both from the $Z_i$ and from
the $Y_i$ by the fact that they commute with both X-operators. They can
be distinguished from each other because $X_0$, $X_2$, $X_3$, $X_5$, $X_6$, and $X_8$
each anticommutes with a single one of the six Z-operators in (N.4)
(respectively $Z_0Z_1$, $Z_1Z_2$, $Z_3Z_4$, $Z_4Z_5$, $Z_6Z_7$, and $Z_7Z_8$) while $X_1$, $X_4$,
and $X_7$ each anticommutes with two distinct Z-operators (respectively
$Z_0Z_1$ and $Z_1Z_2$, $Z_3Z_4$ and $Z_4Z_5$, and $Z_6Z_7$ and $Z_7Z_8$).

(c) Finally, the nine errors $Y_i$ have the same pattern of commutations
with the Z-operators in (N.4) as the corresponding $X_i$ operators,
permitting them to be distinguished from each other in the same way.
They can be distinguished from the $X_i$ operators by their failure to
commute with at least one of the two X-operators in (N.4).

So, as with the other codes we have examined, the simultaneous 
measurement of the eight commuting operators in (N.4) projects the 
corrupted state onto a single one of the terms in (N.3), and the set of 
eigenvalues reveals which term it is. One then applies the appropriate
inverse unitary transformation to restore the uncorrupted state.
A circuit that diagnoses the 9-Qbit error syndrome is shown in 
Figure N.2. 
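That the 22 terms of (N.3) produce 22 distinct eigenvalue patterns can be tabulated directly. The sketch below is an illustration rather than anything given in the text: it represents each operator as a dictionary from Qbit index to Pauli letter (a hypothetical convention), assigns eigenvalue +1 when an error commutes with a syndrome operator and −1 when it anticommutes, and counts the distinct patterns.

```python
def commutes(p, q):
    """Pauli products {Qbit: letter} commute iff they differ on an even number of shared Qbits."""
    return sum(1 for i in p if i in q and p[i] != q[i]) % 2 == 0

# The eight operators of (N.4): six Z pairs followed by the two X sextuples.
OPS = [{a: "Z", b: "Z"} for a, b in [(0, 1), (1, 2), (3, 4), (4, 5), (6, 7), (7, 8)]]
OPS += [{i: "X" for i in range(0, 6)}, {i: "X" for i in range(3, 9)}]

# The 22 terms of (N.3): no error, three Z errors, nine X errors, nine Y errors.
errors = {"1": {}}
for i in (0, 3, 6):
    errors[f"Z{i}"] = {i: "Z"}
for i in range(9):
    errors[f"X{i}"] = {i: "X"}
    errors[f"Y{i}"] = {i: "Y"}

def syndrome(err):
    """Tuple of eight eigenvalues, +1 for commuting and -1 for anticommuting."""
    return tuple(+1 if commutes(op, err) else -1 for op in OPS)

table = {name: syndrome(err) for name, err in errors.items()}
print(len(table), len(set(table.values())))  # number of terms, number of distinct syndromes
```

Inspecting `table` reproduces cases (a) to (c): for example `table["Z3"]` has −1 only in its last two entries, while `table["X4"]` has −1 only in the entries for $Z_3Z_4$ and $Z_4Z_5$.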



Appendix O

Circuit-diagrammatic treatment of 
the 7-Qbit code 


As a further exercise in the use of circuit diagrams, we rederive the 
properties of the 7-Qbit error-correcting code, using the method
developed in Chapter 5 to establish that the circuit in Figure 5.11 gives
the 5-Qbit codewords. 

We start with the observation that the seven mutually commuting
operators $M_i$, $N_i$ ($i = 0, 1, 2$) in (5.42), and $\bar Z$ in (5.49), each with
eigenvalues $\pm 1$, have a set of $2^7$ nondegenerate eigenvectors that form
an orthonormal basis for the entire $2^7$-dimensional space of the seven codeword Qbits.
In particular the two codeword states $|\bar 0\rangle$ and $|\bar 1\rangle$ are the unique
eigenstates of all the $M_i$ and $N_i$ with eigenvalues 1, and of $\bar Z$ with eigenvalues
1 and $-1$, respectively.

It follows from this that if a circuit produces a state $|\Psi\rangle$ that is
invariant under all the $M_i$ and $N_i$ then $|\Psi\rangle$ must be a superposition of
the codeword states $|\bar 0\rangle$ and $|\bar 1\rangle$, and if $|\Psi\rangle$ is additionally an eigenstate
of $\bar Z$ then, to within factors $e^{i\varphi}$ of modulus 1, $|\Psi\rangle$ must be $|\bar 0\rangle$ or $|\bar 1\rangle$
depending on whether the eigenvalue is 1 or $-1$.

Figure O.1 shows that the state $|\Psi\rangle$ produced by the circuit in
Figure 5.10 is indeed invariant under $M_0 = X_0X_4X_5X_6$. This figure
demonstrates that when $M_0$ is brought to the left through all the gates
in the circuit it acts directly as $Z_0$ on the input state on the left, which
is invariant under $Z_0$. The caption explains why essentially the same
argument applies to the other $M_i$: when brought all the way to the
left, $M_1$ reduces to $Z_1$ acting on the input state, and $M_2$ reduces to $Z_2$.
Figure O.2 similarly establishes the invariance of $|\Psi\rangle$ under the three
$N_i$.
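The arguments of Figures O.1 and O.2 rest on three elementary identities: an X on the control Qbit of a cNOT acquires an X on the target when moved through the gate, a Z on the target acquires a Z on the control, and an X moved through a Hadamard becomes a Z. These can be confirmed with 2 × 2 and 4 × 4 matrices. The following NumPy sketch is illustrative only; it assumes the control Qbit is the first tensor factor, and the variable names are hypothetical.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

# cNOT with the first tensor factor as control and the second as target (assumed ordering).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

# X on the control becomes X on control and target when moved through the cNOT.
x_control_spreads = np.allclose(CNOT @ np.kron(X, I2) @ CNOT, np.kron(X, X))
# X on the target commutes with the cNOT.
x_target_commutes = np.allclose(CNOT @ np.kron(I2, X) @ CNOT, np.kron(I2, X))
# Z on the target becomes Z on control and target when moved through the cNOT.
z_target_spreads = np.allclose(CNOT @ np.kron(I2, Z) @ CNOT, np.kron(Z, Z))
# Z on the control commutes with the cNOT.
z_control_commutes = np.allclose(CNOT @ np.kron(Z, I2) @ CNOT, np.kron(Z, I2))
# X moved through a Hadamard becomes Z.
x_through_h = np.allclose(X @ H, H @ Z)

print(x_control_spreads, x_target_commutes, z_target_spreads,
      z_control_commutes, x_through_h)
```

Because the cNOT is its own inverse, each conjugation `CNOT @ A @ CNOT` is exactly the operator that results from moving `A` from one side of the gate to the other.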

Figure O.3 establishes that the effect of $\bar Z = Z_0Z_1Z_2Z_3Z_4Z_5Z_6$ acting
on the right is the same as that of $Z_3Z_4Z_5Z_6$ acting on the left.
But since $Z_6$, $Z_5$, and $Z_4$ all act on the 1-Qbit states $|0\rangle$ this leaves
only $Z_3$ which converts $|\psi\rangle$ to $Z|\psi\rangle$, which multiplies by $(-1)^x$ when
$|\psi\rangle = |x\rangle$. This shows that, as required, $\bar Z|\Psi\rangle = (-1)^x|\Psi\rangle$ when
$|\psi\rangle = |x\rangle$.

Figure O.4 establishes that the effect of $\bar X = X_0X_1X_2X_3X_4X_5X_6$ acting
on the right is the same as that of $Z_0Z_1Z_2X_3$ acting on the left. But
since $Z_0$, $Z_1$, and $Z_2$ all act on the 1-Qbit states $|0\rangle$ this leaves only
$X_3$ which interchanges $|1\rangle$ and $|0\rangle$ when $|\psi\rangle = |x\rangle$. This shows that
$\bar X$ interchanges the corresponding states produced by the circuit. It
also establishes that if $|\Psi\rangle$ differs by a phase factor $e^{i\varphi}$ from $|\bar 0\rangle$ when
$|\psi\rangle = |0\rangle$, then it will differ by the same phase factor from $|\bar 1\rangle$ when
$|\psi\rangle = |1\rangle$.

Fig O.1

Demonstration that the state $|\Psi\rangle$ constructed by the circuit
in Figure 5.10 is invariant under $M_0 = X_0X_4X_5X_6$. We exploit the fact
that bringing an X, acting on the control Qbit of a cNOT, from one
side of the cNOT to the other introduces an additional X acting on the
target Qbit (and the fact that an X acting on the target Qbit commutes
with the cNOT). Bringing the X acting on Qbit 0 to the left of the
three cNOT gates, represented by the controlled triple-NOT on the
right, introduces X operators on all three target Qbits, which combine
with the three X already acting on those Qbits to produce unit
operators. So all four X gates on the right reduce to $X_0$, as indicated in
inset (a). That $X_0$ can be moved further to the left through $H_0$, if it is
changed into $Z_0$, as shown in inset (b). So $M_0$ acting on the extreme
right is equivalent to $Z_0$ acting on the extreme left. Since $Z_0$ leaves the
1-Qbit state $|0\rangle$ invariant, $|\Psi\rangle$ is invariant under $M_0$. A similar
argument applies to $M_1 = X_1X_3X_5X_6$: the $X_i$ all commute with the
first controlled triple-NOT on the right, and then produce a single $X_1$
when moved through the middle controlled triple-NOT, resulting in
$Z_1$ when moved the rest of the way to the left. Similarly,
$M_2 = X_2X_3X_4X_6$ produces $Z_2$ when moved all the way to the left.

Fig O.2

Demonstration that the state $|\Psi\rangle$ constructed by the circuit
in Figure 5.10 is invariant under $N_0 = Z_0Z_4Z_5Z_6$. We exploit the fact
that bringing a Z, acting on the target Qbit of a cNOT, from one side
of the cNOT to the other introduces an additional Z acting on the
control Qbit (and the fact that a Z acting on the control Qbit
commutes with the cNOT). So bringing $Z_4$, $Z_5$, and $Z_6$ to the left of
all three cNOT gates represented by the controlled triple-NOT on the
right introduces three Z operators on the control Qbit 0, which
combine with the $Z_0$ already acting to produce the unit operator,
reducing the collection of four Z gates on the left to the three Z acting
on Qbits 4, 5, and 6, as indicated in (a). Those Z can be moved all the
way to the left, always producing a pair of Z gates on the control Qbits
of the multiple cNOT gates they move through, until they act directly
on the input state as $Z_4Z_5Z_6$, which leaves it invariant. A similar
argument shows that $N_1 = Z_1Z_3Z_5Z_6$ acting on the extreme right is
the same as $Z_5Z_6$ acting on the extreme left, and that $N_2 = Z_2Z_3Z_4Z_6$
on the right is the same as $Z_4Z_6$ on the left.

Fig O.3

Demonstration that $\bar Z = Z_0Z_1Z_2Z_3Z_4Z_5Z_6$ acting on the
right of the circuit in Figure 5.10 is the same as $Z_3Z_4Z_5Z_6$ acting on
the left. Since $Z_4$, $Z_5$, and $Z_6$ all act as the identity on the 1-Qbit states
$|0\rangle$ this leaves only $Z_3$ which converts $|\psi\rangle$ to $Z|\psi\rangle$. This results in a
factor of $(-1)^x$ when $|\psi\rangle = |x\rangle$, showing that $\bar Z|\Psi\rangle = (-1)^x|\Psi\rangle$ when
$|\psi\rangle = |x\rangle$.

Fig O.4

Demonstration that $\bar X = X_0X_1X_2X_3X_4X_5X_6$ acting to the
right of the circuit in Figure 5.10 is the same as $X_3Z_2Z_1Z_0$ acting to
the left. Since $Z_2$, $Z_1$, and $Z_0$ all act as the identity on the 1-Qbit states
$|0\rangle$ this leaves only $X_3$ which converts $|\psi\rangle$ to $X|\psi\rangle$. When $|\psi\rangle = |x\rangle$
this interchanges $|0\rangle$ and $|1\rangle$, and therefore $\bar X$ interchanges the
corresponding states produced by the circuit.

It remains to show that when $|\psi\rangle = |0\rangle$ in Figure 5.10, the resulting
state is given by $|\bar 0\rangle$ without any nontrivial phase factor $e^{i\varphi}$. Since $|0\rangle_7$
appears in the expansion of $|\bar 0\rangle$ with the amplitude $1/2^{3/2}$, we must show
that when the input to the circuit in Figure 5.10 is $|0\rangle_7$ the inner product
of the output with $|0\rangle_7$ is $1/2^{3/2}$, without any accompanying nontrivial
$e^{i\varphi}$. This is established in a circuit-theoretic manner in Figure O.5, as
explained in the caption.


Fig O.5

Demonstration that the state produced by the circuit in Figure 5.10
when $|\psi\rangle = |0\rangle$ has an inner product with the state $|0\rangle_7$ that is $1/2^{3/2}$,
thereby establishing that the state is precisely $|\bar 0\rangle$ without any additional
phase factor. We sandwich the circuit of Figure 5.10 between $|0\rangle_7$ and ${}_7\langle 0|$,
following the procedure developed in Figure 5.19. Since all the cNOT gates
have $|0\rangle$ for their control bits, they all act as the identity. The diagram
simplifies to the form on the right, consisting of four inner products
$\langle 0|0\rangle = 1$ and three matrix elements $\langle 0|H|0\rangle = 1/\sqrt{2}$. So the
inner product is indeed $1/2^{3/2}$.


Appendix P 

On bit commitment 


Alice prepares $n$ Qbits in a computational basis state $|x\rangle$, applies a
certain $n$-Qbit unitary transformation $U$ to the Qbits, and then gives
them to Bob. If Bob knows that all $2^n$ values of $x$ are equally likely,
what can he learn from the Qbits about Alice’s choice of $U$?

The answer is that he can learn nothing whatever about $U$. The most
general thing he can do to acquire information is to adjoin $m$ ancillary
Qbits to the $n$ Alice gave him ($m$ could be zero), subject them all to
a quantum computation that brings about an $(n + m)$-Qbit unitary
transformation $W$, and then measure all $n + m$ Qbits. The state prior
to the measurement will be

$$
|\Psi_x\rangle = W\big((U|x\rangle) \otimes |\Phi\rangle\big), \qquad \text{(P.1)}
$$

where $|\Phi\rangle$ is the initial state of the $m$ ancillas and all $2^n$ values of $x$ from
0 to $2^n - 1$ are equally likely. The probability of Bob getting $z$ when he
measures all $n + m$ Qbits is

$$
p(z) = (1/2^n)\sum_x \langle z|\Psi_x\rangle\langle\Psi_x|z\rangle
     = (1/2^n)\,\langle z|\Big(\sum_x |\Psi_x\rangle\langle\Psi_x|\Big)|z\rangle. \qquad \text{(P.2)}
$$


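We have

$$
|\Psi_x\rangle\langle\Psi_x| = W\big((U|x\rangle\langle x|U^\dagger) \otimes (|\Phi\rangle\langle\Phi|)\big)W^\dagger, \qquad \text{(P.3)}
$$

and since

$$
\sum_x |x\rangle\langle x| = 1, \qquad \text{(P.4)}
$$

we then have

$$
\sum_x |\Psi_x\rangle\langle\Psi_x| = W\big((UU^\dagger) \otimes (|\Phi\rangle\langle\Phi|)\big)W^\dagger
= W\big(1 \otimes (|\Phi\rangle\langle\Phi|)\big)W^\dagger. \qquad \text{(P.5)}
$$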


We see from (P.2) and (P.5) that $U$ has dropped out of the probability
$p(z)$, so the outcome of Bob’s final measurement provides no
information whatever about Alice’s unitary transformation.
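The conclusion can be illustrated numerically for a small case. The sketch below is only an example under stated assumptions: it takes $n = 2$ and $m = 1$, draws Bob's $W$ at random from a QR decomposition (a convenient way to obtain some unitary, not a construction from the text), starts the ancilla in $|0\rangle$, and compares the outcome distribution (P.2) when Alice's $U$ is the identity with that when it is the product of Hadamards. The function names `random_unitary` and `outcome_distribution` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(dim):
    """Some unitary matrix, taken from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(a)
    return q

def outcome_distribution(U, W, phi, n):
    """p(z) of (P.2): average over the 2**n equally likely x of |<z| W (U|x> ⊗ |Φ>)|**2."""
    dim = 2 ** n
    p = np.zeros(W.shape[0])
    for x in range(dim):
        ket_x = np.zeros(dim)
        ket_x[x] = 1.0
        psi_x = W @ np.kron(U @ ket_x, phi)   # the state (P.1)
        p += np.abs(psi_x) ** 2
    return p / dim

n, m = 2, 1                                    # example sizes, chosen arbitrarily
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
U_identity = np.eye(2 ** n)
U_hadamard = np.kron(H, H)                     # tensor product of n = 2 Hadamards

W = random_unitary(2 ** (n + m))               # Bob's arbitrary computation
phi = np.zeros(2 ** m)
phi[0] = 1.0                                   # ancilla starts in |0>

p_id = outcome_distribution(U_identity, W, phi, n)
p_h = outcome_distribution(U_hadamard, W, phi, n)
print(np.allclose(p_id, p_h))
```

Replacing `U_hadamard` by `random_unitary(2 ** n)` gives the same agreement, as (P.5) requires for any unitary $U$.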

In the application to bit commitment in Section 6.3, Alice’s unitary
transformation $U$ is either the $n$-Qbit identity or the tensor product
of $n$ 1-Qbit Hadamards, $H^{\otimes n}$, and the random $n$-Qbit state $|x\rangle$ arises
from the tensor product of $n$ 1-Qbit states, each of which is randomly
$|0\rangle$ or $|1\rangle$.

One might wonder whether Bob could do better by measuring some 
subset of all the Qbits at an intermediate stage of the computation, 
and then applying further unitary transformations to the unmeasured 
Qbits conditional upon the outcome of that measurement. But this, 
by an inversion of the Griffiths-Niu argument in Section 3.6, would 
be equivalent to first applying an appropriate multi-Qbit controlled 
unitary gate, and only then measuring the control Qbits. That gate can 
be absorbed in W and the subsequent measurement of its control Qbits 
deferred to the end of the computation. So this possibility is covered 
by the case already considered.